First, let’s load some packages including HTSSIP
.
Also let’s get an overview of the phyloseq object that we’re going to use.
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1072 taxa and 139 samples ]
## sample_data() Sample Data: [ 139 samples by 17 sample variables ]
## tax_table() Taxonomy Table: [ 1072 taxa by 8 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 1072 tips and 1071 internal nodes ]
See HTSSIP introduction vignette for a description on why dataset parsing is needed.
Let’s the parameters for parsing the dataset into individual treatment-control comparisons.
## Substrate Day
## 12C-Con.D14.R1_F17 12C-Con 14
## 13C-Cel.D3.R1_F21 13C-Cel 3
## 13C-Cel.D14.R1_F12 13C-Cel 14
## 13C-Glu.D14.R1_F19 13C-Glu 14
## 12C-Con.D3.R3_F11 12C-Con 3
## 13C-Glu.D3.R1_F27 13C-Glu 3
We just need the parameters for each treatment, so let’s filter out the controls. In this case, the controls are all ‘12C-Con’.
## Substrate Day
## 13C-Cel.D3.R1_F21 13C-Cel 3
## 13C-Cel.D14.R1_F12 13C-Cel 14
## 13C-Glu.D14.R1_F19 13C-Glu 14
## 13C-Glu.D3.R1_F27 13C-Glu 3
Now, we will use an expression that will subset the phyloseq object into the comparisons that we want to make.
ex
is the expression that will be used for pruning the
phyloseq object
ex = "(Substrate=='12C-Con' & Day=='${Day}') | (Substrate=='${Substrate}' & Day == '${Day}')"
physeq_S2D2_l = phyloseq_subset(physeq_S2D2, params, ex)
physeq_S2D2_l
## $`(Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3')`
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1072 taxa and 46 samples ]
## sample_data() Sample Data: [ 46 samples by 17 sample variables ]
## tax_table() Taxonomy Table: [ 1072 taxa by 8 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 1072 tips and 1071 internal nodes ]
##
## $`(Substrate=='12C-Con' & Day=='14') | (Substrate=='13C-Cel' & Day == '14')`
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1072 taxa and 46 samples ]
## sample_data() Sample Data: [ 46 samples by 17 sample variables ]
## tax_table() Taxonomy Table: [ 1072 taxa by 8 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 1072 tips and 1071 internal nodes ]
##
## $`(Substrate=='12C-Con' & Day=='14') | (Substrate=='13C-Glu' & Day == '14')`
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1072 taxa and 47 samples ]
## sample_data() Sample Data: [ 47 samples by 17 sample variables ]
## tax_table() Taxonomy Table: [ 1072 taxa by 8 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 1072 tips and 1071 internal nodes ]
##
## $`(Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Glu' & Day == '3')`
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1072 taxa and 46 samples ]
## sample_data() Sample Data: [ 46 samples by 17 sample variables ]
## tax_table() Taxonomy Table: [ 1072 taxa by 8 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 1072 tips and 1071 internal nodes ]
Now, let’s actually make ordinations of each treatment compared to
the control. This will return a data.frame
object for
plotting.
# running in parallel
doParallel::registerDoParallel(2)
physeq_S2D2_l_df = SIP_betaDiv_ord(physeq_S2D2_l, parallel=TRUE)
## Run 0 stress 0.05999876
## Run 1 stress 0.05999868
## ... New best solution
## ... Procrustes: rmse 3.375356e-05 max resid 9.618937e-05
## ... Similar to previous best
## Run 2 stress 0.05999857
## ... New best solution
## ... Procrustes: rmse 0.0002732117 max resid 0.000775849
## ... Similar to previous best
## Run 3 stress 0.05999877
## ... Procrustes: rmse 8.500827e-05 max resid 0.0002406656
## ... Similar to previous best
## Run 4 stress 0.05999872
## ... Procrustes: rmse 0.0002915895 max resid 0.0008291217
## ... Similar to previous best
## Run 5 stress 0.05999849
## ... New best solution
## ... Procrustes: rmse 0.0001366149 max resid 0.0003880624
## ... Similar to previous best
## Run 6 stress 0.05999881
## ... Procrustes: rmse 0.0002198795 max resid 0.0006218413
## ... Similar to previous best
## Run 7 stress 0.05999863
## ... Procrustes: rmse 0.0001696464 max resid 0.0004818257
## ... Similar to previous best
## Run 8 stress 0.0599986
## ... Procrustes: rmse 0.0001507035 max resid 0.0004282821
## ... Similar to previous best
## Run 9 stress 0.05999851
## ... Procrustes: rmse 8.524178e-05 max resid 0.0002424407
## ... Similar to previous best
## Run 10 stress 0.05999882
## ... Procrustes: rmse 0.0001811825 max resid 0.0005133753
## ... Similar to previous best
## Run 11 stress 0.05999878
## ... Procrustes: rmse 0.0002258468 max resid 0.0006408356
## ... Similar to previous best
## Run 12 stress 0.05999855
## ... Procrustes: rmse 0.0001225478 max resid 0.0003482748
## ... Similar to previous best
## Run 13 stress 0.0599988
## ... Procrustes: rmse 0.0002325677 max resid 0.0006597193
## ... Similar to previous best
## Run 14 stress 0.05999855
## ... Procrustes: rmse 0.0001174833 max resid 0.0003373234
## ... Similar to previous best
## Run 15 stress 0.05999879
## ... Procrustes: rmse 0.0002314273 max resid 0.0006570317
## ... Similar to previous best
## Run 16 stress 0.05999888
## ... Procrustes: rmse 0.0002080716 max resid 0.0005929058
## ... Similar to previous best
## Run 17 stress 0.05999872
## ... Procrustes: rmse 0.0002079436 max resid 0.0005897428
## ... Similar to previous best
## Run 18 stress 0.05999869
## ... Procrustes: rmse 0.0001826661 max resid 0.000517696
## ... Similar to previous best
## Run 19 stress 0.2366197
## Run 20 stress 0.05999857
## ... Procrustes: rmse 0.000132549 max resid 0.0003781502
## ... Similar to previous best
## *** Best solution repeated 15 times
## Run 0 stress 0.0719636
## Run 1 stress 0.1319373
## Run 2 stress 0.1296168
## Run 3 stress 0.1010428
## Run 4 stress 0.07206107
## ... Procrustes: rmse 0.005102571 max resid 0.02294879
## Run 5 stress 0.07196353
## ... New best solution
## ... Procrustes: rmse 4.210701e-05 max resid 0.0001059697
## ... Similar to previous best
## Run 6 stress 0.1283983
## Run 7 stress 0.1329551
## Run 8 stress 0.1374125
## Run 9 stress 0.1318293
## Run 10 stress 0.1361472
## Run 11 stress 0.1199624
## Run 12 stress 0.1306889
## Run 13 stress 0.07393819
## Run 14 stress 0.1172224
## Run 15 stress 0.07196353
## ... New best solution
## ... Procrustes: rmse 1.499708e-05 max resid 7.828606e-05
## ... Similar to previous best
## Run 16 stress 0.07196357
## ... Procrustes: rmse 3.231024e-05 max resid 8.906342e-05
## ... Similar to previous best
## Run 17 stress 0.07457532
## Run 18 stress 0.07404651
## Run 19 stress 0.1186496
## Run 20 stress 0.07196364
## ... Procrustes: rmse 0.0001605849 max resid 0.000469045
## ... Similar to previous best
## *** Best solution repeated 3 times
## Run 0 stress 0.07255501
## Run 1 stress 0.07255503
## ... Procrustes: rmse 3.528305e-05 max resid 0.0001201565
## ... Similar to previous best
## Run 2 stress 0.07255501
## ... Procrustes: rmse 8.818126e-06 max resid 3.144746e-05
## ... Similar to previous best
## Run 3 stress 0.07255501
## ... Procrustes: rmse 8.016805e-06 max resid 2.850354e-05
## ... Similar to previous best
## Run 4 stress 0.07255501
## ... Procrustes: rmse 1.393761e-05 max resid 5.245768e-05
## ... Similar to previous best
## Run 5 stress 0.07255501
## ... Procrustes: rmse 6.882707e-06 max resid 2.856283e-05
## ... Similar to previous best
## Run 6 stress 0.07255501
## ... Procrustes: rmse 1.433811e-05 max resid 4.715108e-05
## ... Similar to previous best
## Run 7 stress 0.07255501
## ... Procrustes: rmse 2.148759e-06 max resid 1.131891e-05
## ... Similar to previous best
## Run 8 stress 0.07255502
## ... Procrustes: rmse 2.617178e-05 max resid 9.777244e-05
## ... Similar to previous best
## Run 9 stress 0.07255501
## ... Procrustes: rmse 3.286859e-06 max resid 1.474209e-05
## ... Similar to previous best
## Run 10 stress 0.07255502
## ... Procrustes: rmse 2.257142e-05 max resid 7.488549e-05
## ... Similar to previous best
## Run 11 stress 0.07255501
## ... Procrustes: rmse 4.89976e-06 max resid 1.788823e-05
## ... Similar to previous best
## Run 12 stress 0.07255502
## ... Procrustes: rmse 1.722529e-05 max resid 5.634782e-05
## ... Similar to previous best
## Run 13 stress 0.07255501
## ... Procrustes: rmse 3.915431e-06 max resid 1.688773e-05
## ... Similar to previous best
## Run 14 stress 0.07255505
## ... Procrustes: rmse 4.298506e-05 max resid 0.000145347
## ... Similar to previous best
## Run 15 stress 0.07255501
## ... Procrustes: rmse 1.006472e-05 max resid 3.997473e-05
## ... Similar to previous best
## Run 16 stress 0.07255502
## ... Procrustes: rmse 1.110403e-05 max resid 4.277427e-05
## ... Similar to previous best
## Run 17 stress 0.07255501
## ... Procrustes: rmse 6.615042e-06 max resid 2.476704e-05
## ... Similar to previous best
## Run 18 stress 0.07255501
## ... Procrustes: rmse 1.313579e-05 max resid 5.134526e-05
## ... Similar to previous best
## Run 19 stress 0.07255502
## ... Procrustes: rmse 2.212686e-05 max resid 7.671359e-05
## ... Similar to previous best
## Run 20 stress 0.07255503
## ... Procrustes: rmse 2.800639e-05 max resid 0.0001054077
## ... Similar to previous best
## *** Best solution repeated 20 times
## Run 0 stress 0.1080812
## Run 1 stress 0.10136
## ... New best solution
## ... Procrustes: rmse 0.03470367 max resid 0.1337577
## Run 2 stress 0.108081
## Run 3 stress 0.10136
## ... New best solution
## ... Procrustes: rmse 0.0004028729 max resid 0.001479001
## ... Similar to previous best
## Run 4 stress 0.108081
## Run 5 stress 0.1080812
## Run 6 stress 0.1013599
## ... New best solution
## ... Procrustes: rmse 2.982353e-05 max resid 0.0001104086
## ... Similar to previous best
## Run 7 stress 0.1013598
## ... New best solution
## ... Procrustes: rmse 0.0002389074 max resid 0.0008782957
## ... Similar to previous best
## Run 8 stress 0.1080814
## Run 9 stress 0.1013603
## ... Procrustes: rmse 0.0002179867 max resid 0.0008009432
## ... Similar to previous best
## Run 10 stress 0.1080813
## Run 11 stress 0.1080813
## Run 12 stress 0.1080813
## Run 13 stress 0.1013598
## ... Procrustes: rmse 1.866392e-05 max resid 6.641849e-05
## ... Similar to previous best
## Run 14 stress 0.1080814
## Run 15 stress 0.1080816
## Run 16 stress 0.1080811
## Run 17 stress 0.1013601
## ... Procrustes: rmse 0.0001742879 max resid 0.0006388473
## ... Similar to previous best
## Run 18 stress 0.1080813
## Run 19 stress 0.1450322
## Run 20 stress 0.1013601
## ... Procrustes: rmse 0.0001467166 max resid 0.0005381576
## ... Similar to previous best
## *** Best solution repeated 5 times
## phyloseq_subset
## 1 (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3')
## 2 (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3')
## 3 (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3')
## NMDS1 NMDS2 X.Sample primer_number fwd_barcode
## 1 -0.15454387 -0.03536686 13C-Cel.D3.R1_F21 7 GGATATCT
## 2 -0.20505811 -0.10166080 13C-Cel.D3.R1_F20 6 CGTGAGTG
## 3 -0.02323964 -0.00216143 13C-Cel.D3.R1_F15 1 ATCGTACG
## rev_barcode Substrate Day Microcosm_replicate Fraction Buoyant_density
## 1 CGAGAGTT 13C-Cel 3 1 21 1.705640
## 2 CGAGAGTT 13C-Cel 3 1 20 1.706186
## 3 CGAGAGTT 13C-Cel 3 1 15 1.725310
## Sample_type
## 1 unknown
## 2 unknown
## 3 unknown
## library
## 1 /var/seq_data/fullCyc/MiSeq_16SrRNA/515f-806r/lib1-7rs/../150721_lib4/run1/metadata.txt
## 2 /var/seq_data/fullCyc/MiSeq_16SrRNA/515f-806r/lib1-7rs/../150721_lib4/run1/metadata.txt
## 3 /var/seq_data/fullCyc/MiSeq_16SrRNA/515f-806r/lib1-7rs/../150721_lib4/run1/metadata.txt
## Exp_type Sample_location Sample_date Sample_treatment Sample_subtreatment
## 1 SIP NA NA NA NA
## 2 SIP NA NA NA NA
## 3 SIP NA NA NA NA
## core_dataset
## 1 TRUE
## 2 TRUE
## 3 TRUE
Each specific phyloseq subset (treatment-control comparison) is delimited with the “phyloseq_subset” column.
## [1] (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3')
## [2] (Substrate=='12C-Con' & Day=='14') | (Substrate=='13C-Cel' & Day == '14')
## [3] (Substrate=='12C-Con' & Day=='14') | (Substrate=='13C-Glu' & Day == '14')
## [4] (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Glu' & Day == '3')
## 4 Levels: (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3') ...
For clarity, I’m going edit these long strings to make them more readable.
physeq_S2D2_l_df = physeq_S2D2_l_df %>%
dplyr::mutate(phyloseq_subset = gsub(' \\| ', '\n', phyloseq_subset),
phyloseq_subset = gsub('\'3\'', '\'03\'', phyloseq_subset))
physeq_S2D2_l_df %>% .$phyloseq_subset %>% unique
## [1] "(Substrate=='12C-Con' & Day=='03')\n(Substrate=='13C-Cel' & Day == '03')"
## [2] "(Substrate=='12C-Con' & Day=='14')\n(Substrate=='13C-Cel' & Day == '14')"
## [3] "(Substrate=='12C-Con' & Day=='14')\n(Substrate=='13C-Glu' & Day == '14')"
## [4] "(Substrate=='12C-Con' & Day=='03')\n(Substrate=='13C-Glu' & Day == '03')"
OK, let’s plot the data!
As you can see, the ‘heavy’ gradient fraction ‘communities’ for the labeled-treatments tend to diverge from the unlabeled gradient fraction communities, but the amount of divergence in dependent on substrate and time point.
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] phyloseq_1.51.0 HTSSIP_1.4.1 ggplot2_3.5.1 tidyr_1.3.1
## [5] dplyr_1.1.4 rmarkdown_2.29
##
## loaded via a namespace (and not attached):
## [1] ade4_1.7-22 tidyselect_1.2.1
## [3] farver_2.1.2 Biostrings_2.75.1
## [5] fastmap_1.2.0 lazyeval_0.2.2
## [7] digest_0.6.37 lifecycle_1.0.4
## [9] cluster_2.1.6 survival_3.7-0
## [11] magrittr_2.0.3 compiler_4.4.2
## [13] rlang_1.1.4 sass_0.4.9
## [15] tools_4.4.2 igraph_2.1.1
## [17] utf8_1.2.4 yaml_2.3.10
## [19] data.table_1.16.2 knitr_1.49
## [21] S4Arrays_1.7.1 labeling_0.4.3
## [23] DelayedArray_0.33.2 plyr_1.8.9
## [25] abind_1.4-8 BiocParallel_1.41.0
## [27] withr_3.0.2 purrr_1.0.2
## [29] BiocGenerics_0.53.3 sys_3.4.3
## [31] grid_4.4.2 stats4_4.4.2
## [33] fansi_1.0.6 multtest_2.63.0
## [35] biomformat_1.35.0 colorspace_2.1-1
## [37] Rhdf5lib_1.29.0 scales_1.3.0
## [39] iterators_1.0.14 MASS_7.3-61
## [41] SummarizedExperiment_1.37.0 cli_3.6.3
## [43] vegan_2.6-8 crayon_1.5.3
## [45] generics_0.1.3 httr_1.4.7
## [47] reshape2_1.4.4 ape_5.8
## [49] cachem_1.1.0 rhdf5_2.51.0
## [51] stringr_1.5.1 zlibbioc_1.52.0
## [53] splines_4.4.2 parallel_4.4.2
## [55] XVector_0.47.0 matrixStats_1.4.1
## [57] vctrs_0.6.5 Matrix_1.7-1
## [59] jsonlite_1.8.9 IRanges_2.41.1
## [61] S4Vectors_0.45.2 coenocliner_0.2-3
## [63] maketools_1.3.1 locfit_1.5-9.10
## [65] foreach_1.5.2 jquerylib_0.1.4
## [67] glue_1.8.0 codetools_0.2-20
## [69] stringi_1.8.4 gtable_0.3.6
## [71] GenomeInfoDb_1.43.0 GenomicRanges_1.59.0
## [73] UCSC.utils_1.3.0 munsell_0.5.1
## [75] tibble_3.2.1 pillar_1.9.0
## [77] htmltools_0.5.8.1 rhdf5filters_1.19.0
## [79] GenomeInfoDbData_1.2.13 R6_2.5.1
## [81] doParallel_1.0.17 evaluate_1.0.1
## [83] lattice_0.22-6 Biobase_2.67.0
## [85] bslib_0.8.0 Rcpp_1.0.13-1
## [87] SparseArray_1.7.2 nlme_3.1-166
## [89] permute_0.9-7 mgcv_1.9-1
## [91] DESeq2_1.47.1 xfun_0.49
## [93] MatrixGenerics_1.19.0 buildtools_1.0.0
## [95] pkgconfig_2.0.3