Poster session and reception

Wednesday, November 7, 2018 - 5:00pm - 6:00pm
Lind 400

  • Estimating Individualized Decision Rules with Tail Controls
    Zhengling Qi (University of North Carolina, Chapel Hill)
    With the emergence of precision medicine, estimating optimal individualized decision rules (IDRs) has attracted tremendous attention in many scientific areas. Most existing literature has focused on finding optimal IDRs that maximize the expected outcome for each individual. Motivated by complex individualized decision making procedures and the popular conditional value at risk (CVaR) measure, we propose two new robust criteria for estimating optimal IDRs: one controls the average lower tail of the subjects' outcomes, and the other controls the individualized lower tail of each subject's outcome. In addition to optimizing the individualized expected outcome, our proposed criteria take risk into consideration, so the resulting IDRs can prevent adverse events caused by a heavy lower tail of the outcome distribution. Interestingly, from the perspective of duality theory, the optimal IDR under our criteria can be interpreted as the decision rule that maximizes the “worst-case scenario” of the individualized outcome within a probability-constrained set. The corresponding estimation procedures are implemented using two proposed efficient non-convex optimization algorithms, based on recent developments in difference-of-convex (DC) and majorization-minimization (MM) algorithms, which can be shown to converge to the sharpest stationary points of the criteria. We provide a comprehensive statistical analysis of our estimated optimal IDRs under the proposed criteria, including consistency and finite sample error bounds. Simulation studies and a real data application further demonstrate the robust performance of our methods.
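The abstract describes CVaR-style tail control in words only; as a rough illustration of the idea (not the authors' estimator), the lower alpha-tail of an outcome sample under a candidate rule can be summarized as follows. The function name `cvar_lower` and the simulated rules are hypothetical:

```python
import numpy as np

def cvar_lower(outcomes, alpha=0.1):
    """Average of the lower alpha-tail of the outcomes:
    CVaR_alpha = E[Y | Y <= q_alpha], with q_alpha the alpha-quantile."""
    outcomes = np.sort(np.asarray(outcomes, dtype=float))
    k = max(1, int(np.ceil(alpha * len(outcomes))))  # size of the lower tail
    return outcomes[:k].mean()

# Comparing two hypothetical decision rules by their lower tails, rather
# than their means, favors the rule with the lighter lower tail.
rng = np.random.default_rng(0)
y_rule_a = rng.normal(1.0, 1.0, size=1000)   # higher mean, heavier lower tail
y_rule_b = rng.normal(0.9, 0.3, size=1000)   # slightly lower mean, lighter tail
print(cvar_lower(y_rule_a), cvar_lower(y_rule_b))
```

A rule that is slightly worse on average can still dominate under a lower-tail criterion, which is the risk-aware trade-off the poster formalizes.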
  • Variance-Regularized Policy Learning
    Weibin Mo (University of North Carolina, Chapel Hill)
    Off-line policy learning aims to prescribe an optimal individualized treatment rule (ITR, a mapping from the covariate space to designated actions) based on batch data of covariates, treatments, and outcomes collected from a randomized controlled trial (RCT) or an observational study. Recent developments have turned the learning problem into an empirical outcome-weighted risk minimization, which can suffer from instability of the stochastic weights. We propose a variance-regularized version to address this issue and show empirical gains in regret reduction in unbalanced optimal treatment settings. Our experiments demonstrate that the proposed pipeline produces more stable rules while maintaining robustness across replications.
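As a minimal sketch of the kind of variance regularization described above (assuming an inverse-probability-weighted value estimate; the function name and penalty form are illustrative, not the poster's exact objective):

```python
import numpy as np

def penalized_value(y, a, propensity, rule_actions, lam=1.0):
    """Empirical IPW value of a rule minus a variance penalty:
    mean(w) - lam * sd(w), where w_i = y_i * 1{a_i = d(x_i)} / pi_i.
    Penalizing the weight spread discourages rules whose value estimate
    rests on a few large, unstable stochastic weights."""
    w = y * (a == rule_actions) / propensity
    return w.mean() - lam * w.std(ddof=1)
```

With `lam=0` this reduces to the plain outcome-weighted value; larger `lam` trades estimated value for stability.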
  • Mapping immunogenic epitopes in tumor-associated neopeptides generated by genomic rearrangement
    Julia Udell (Mayo Clinic)
    Tumor-specific neoepitopes derived from frameshift mutations or structural rearrangements are more likely to be immunogenic than those derived from single nucleotide variations, and correlate with better responses to immunotherapy. As advanced genomic testing becomes more commonplace in cancer profiling, prediction of immunogenic neoantigens and interpretable illustration of what drives their immunogenicity become increasingly critical. Here we introduce a standardized epitope map: a method of quantifying and visualizing immunogenic regions in tumor-derived neopeptides.
  • Causal Estimators for a Primary Data Source with Borrowing from a Supplemental Source
    Jeffrey Boatman (University of Minnesota, Twin Cities)
    Borrowing data from supplemental data sources for the analysis of a primary data source can potentially result in a more precise estimator for causal effects. However, borrowing information from multiple clinical trials to estimate causal effects is challenging. Borrowing information is most natural in a Bayesian setting, but methods for addressing confounding, such as inverse probability weighting or propensity score analyses, cannot be easily adapted to a Bayesian setting. Regression models can be used for causal inference in a Bayesian setting, but these require careful consideration of model specification and assumptions. To address these issues, we propose using Bayesian Additive Regression Trees (BART) to borrow information between clinical trials. BART is well known for its strong out-of-the-box performance, and it does not require the researcher to carefully specify a regression model. Although BART is a black box method, and the model parameters are not readily interpretable, all causal inference can be done with the predicted outcomes from the model. We illustrate the performance of the causal estimators with a small simulation study to assess BART's performance in estimating the population average treatment effect and the conditional average treatment effects, and we discuss the implications for precision medicine.
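The plug-in logic described above — fit an outcome model, then average predicted potential-outcome differences — can be sketched as follows. BART itself needs a specialized package, so an ordinary-least-squares model stands in for it here; `plug_in_ate` and `ols_fit_predict` are hypothetical names:

```python
import numpy as np

def plug_in_ate(x, a, y, fit_predict):
    """Plug-in average treatment effect from any outcome model:
    fit y ~ (x, a), then average predicted y(a=1) - y(a=0) over all
    subjects. BART would play the role of `fit_predict` in the poster's
    approach; any regressor with the same interface works."""
    xa = np.column_stack([x, a])
    x1 = np.column_stack([x, np.ones(len(x))])   # everyone treated
    x0 = np.column_stack([x, np.zeros(len(x))])  # everyone untreated
    return (fit_predict(xa, y, x1) - fit_predict(xa, y, x0)).mean()

def ols_fit_predict(X, y, X_new):
    """Ordinary least squares (with intercept) as a stand-in outcome model."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ beta
```

The per-subject differences `fit_predict(xa, y, x1) - fit_predict(xa, y, x0)` are the conditional treatment effect estimates the abstract mentions; their mean is the population ATE.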
  • A multitask learning approach for clustering multiple single cell RNA-seq datasets
    Zhuliu Li (University of Minnesota)
    scRNA-seq enables detailed profiling of heterogeneous cell populations and can be used to reveal lineage relationships or discover new cell types. This study focuses on developing computational methods for cross-population transcriptome analysis of multiple single-cell populations. The cross-cell-population clustering problem is different from the traditional clustering problem because single-cell populations can be collected from different patients, different samples of a tissue, or different experimental replicates. The accompanying biological and technical variation tends to dominate the signals for clustering the pooled single cells from the multiple populations. We have developed a multitask clustering method to address the cross-population clustering problem. The method simultaneously clusters each individual cell population and controls variance among the cell type centers within each cell population and across the cell populations. Our results make it evident that multitask clustering is a promising new approach for cross-population analysis of scRNA-seq data.
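A toy version of the idea — per-population k-means whose matched centers are shrunk toward a shared cross-population mean — might look like the sketch below. Real methods must also align cluster identities across populations; this sketch naively matches centers by index and is not the poster's algorithm:

```python
import numpy as np

def multitask_kmeans(populations, k, lam=0.5, iters=20, seed=0):
    """Toy multitask k-means: each population keeps its own k centers,
    but after every update the index-matched centers are shrunk toward
    their cross-population mean with strength lam in [0, 1]."""
    rng = np.random.default_rng(seed)
    centers = [pop[rng.choice(len(pop), k, replace=False)] for pop in populations]
    for _ in range(iters):
        for i, pop in enumerate(populations):
            # assign each cell to its nearest center within its population
            d = ((pop[:, None, :] - centers[i][None, :, :]) ** 2).sum(-1)
            lab = d.argmin(1)
            for c in range(k):
                if (lab == c).any():
                    centers[i][c] = pop[lab == c].mean(0)
        shared = np.mean(centers, axis=0)  # cross-population mean centers
        centers = [(1 - lam) * c + lam * shared for c in centers]
    return centers
```

The shrinkage step is the "control variance across populations" ingredient: `lam=0` recovers independent per-population k-means, while `lam=1` forces a single shared set of centers.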
  • Transfer Learning across Ontologies for Phenome-Genome Association Prediction
    Raphael Petegrosso (University of Minnesota)
    Motivation: To better predict and analyze gene associations with the collection of phenotypes organized in a phenotype ontology, it is crucial to effectively model the hierarchical structure among the phenotypes in the ontology and leverage the sparse known associations with additional training information. We first introduce Dual Label Propagation (DLP) to impose consistent associations with the entire phenotype paths in predicting phenotype-gene associations in Human Phenotype Ontology (HPO). DLP is then used as the base model in a transfer learning framework (tlDLP) to incorporate functional annotations in Gene Ontology (GO). By simultaneously reconstructing GO term-gene associations and HPO phenotype-gene associations for all the genes in a protein-protein interaction network, tlDLP benefits from the enriched training associations indirectly through relation with GO terms.
    Results: In the experiments to predict the associations between human genes and phenotypes in HPO based on human protein-protein interaction network, both DLP and tlDLP improved the prediction of gene associations with phenotype paths in HPO in cross-validation and the prediction of the most recent associations added after the snapshot of the training data. Moreover, the transfer learning through GO term-gene associations improved association predictions for the phenotypes with no more specific known associations by a large margin. Examples are also shown to demonstrate how phenotype paths in phenotype ontology and transfer learning with gene ontology can improve the predictions.
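DLP builds on graph-based label propagation; the single-graph core of that family of methods (not DLP's dual, ontology-coupled version) can be sketched as:

```python
import numpy as np

def label_propagation(W, Y, alpha=0.8, iters=50):
    """Classic graph label propagation: iterate
        F <- alpha * S @ F + (1 - alpha) * Y,
    with S the symmetrically normalized adjacency D^{-1/2} W D^{-1/2}.
    Rows of Y are seed label scores (e.g. known phenotype-gene
    associations); F spreads them along graph edges. DLP couples two
    such propagations; this shows only the single-graph building block."""
    d = W.sum(1)
    S = W / np.sqrt(np.outer(d, d))
    F = Y.astype(float).copy()
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * Y
    return F
```

On a protein-protein interaction network, a gene connected to a seed gene for some phenotype accumulates score for that phenotype, which is the mechanism the paper extends along ontology paths.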
  • Estimating Optimal Treatment Regime to Maximize Restricted Mean Survival Using Random Survival Forests
    Sanhita Sengupta (University of Minnesota, Twin Cities)
    The impact of individualized treatment regimes on patient survival is one of the crucial aspects of personalized medicine. We propose a treatment regime estimated with the nonparametric random survival forest approach in order to maximize restricted mean survival time for right-censored data. We also provide an estimate of the restricted mean survival time under different treatment regimes and study the theoretical properties of the optimal treatment regime. Simulation studies and a data analysis verify the performance of our proposed method.
  • Treatment Allocation using Multi-Armed Bandits with Covariates
    Sakshi Arya (University of Minnesota, Twin Cities)
    In clinical practice, a doctor decides which treatments are best for his/her patients. Multi-Armed Bandit with Covariates (MABC) is a sequential decision-making algorithm for treatment allocation aimed at maximizing patient benefit. At times, doctors may disagree with the treatment proposed by the algorithm. In this work, we develop a system that holistically integrates the adaptive learning of the MABC algorithm with the doctor's ability to take advantage of his/her insight.
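A toy contextual bandit with a doctor-override loop might be organized as below: an epsilon-greedy scheme with per-arm linear reward models, updated from whichever arm was actually given. This is an illustrative stand-in, not the poster's algorithm, and all names are hypothetical:

```python
import numpy as np

class EpsilonGreedyMABC:
    """Toy epsilon-greedy bandit with covariates: one linear reward model
    per arm, fit by least squares on the observed (context, reward) pairs.
    A doctor may override the proposed arm; calling `update` with the arm
    actually given lets the model learn from the override as well."""

    def __init__(self, n_arms, dim, eps=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n_arms, self.dim, self.eps = n_arms, dim, eps
        self.data = {a: ([], []) for a in range(n_arms)}  # per-arm (X, r)

    def propose(self, x):
        if self.rng.random() < self.eps:
            return int(self.rng.integers(self.n_arms))   # explore
        return int(np.argmax([self._predict(a, x) for a in range(self.n_arms)]))

    def _predict(self, a, x):
        X, r = self.data[a]
        if len(X) <= self.dim:
            return np.inf  # force exploration of under-sampled arms
        beta, *_ = np.linalg.lstsq(np.asarray(X), np.asarray(r), rcond=None)
        return float(x @ beta)

    def update(self, a, x, reward):
        """Record the outcome of the arm actually administered."""
        self.data[a][0].append(x)
        self.data[a][1].append(reward)
```

Because `update` takes the administered arm rather than the proposed one, a doctor's override simply becomes another data point, which is one simple way to integrate clinician insight with adaptive learning.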
  • Robust Confidence Intervals for Optimal Treatment Regimes
    Yunan Wu (University of Minnesota, Twin Cities)
    An optimal treatment regime is a decision rule that aims to maximize the average outcome when applied to assign treatment to each individual in the population. The problem of constructing confidence intervals for the parameters indexing the optimal treatment regime, or for the optimal value function, is challenging and has been relatively little studied in the literature. Estimation of the optimal treatment regime is known to be sensitive to the specification of the outcome regression model. On the other hand, robust methods that do not rely on an outcome regression model lead to nonstandard asymptotics. To overcome this difficulty for inference, we propose a smoothed robust estimator that has an asymptotically normal distribution and does not require specifying an outcome regression model. We rigorously prove that the bootstrapped confidence intervals provide asymptotically accurate inference for both the parameters indexing the optimal treatment regime and the optimal value function. Furthermore, we present a new algorithm to calculate the proposed estimator with substantially improved speed and stability. Numerical results demonstrate the satisfactory performance of the new methods.
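The smoothing idea — replacing the non-differentiable indicator in the value function with a smooth kernel — can be illustrated as follows. A logistic kernel and an IPW value are assumed for concreteness; the authors' estimator may use a different kernel and objective:

```python
import numpy as np

def smoothed_value(beta, X, A, Y, pi, h=0.1):
    """Smoothed IPW value of the linear rule d(x) = 1{x @ beta > 0}.
    The hard indicator 1{A = d(X)} is replaced by a logistic kernel
    sigmoid((x @ beta) / h), yielding a differentiable objective whose
    smoothness is controlled by the bandwidth h (hard rule as h -> 0)."""
    z = np.clip((X @ beta) / h, -50, 50)  # clip for numerical stability
    s = 1.0 / (1.0 + np.exp(-z))          # smooth "rule says treat" probability
    match = A * s + (1 - A) * (1 - s)     # smooth version of 1{A = d(X)}
    return np.mean(Y * match / pi)
```

The smoothed objective sidesteps the nonstandard asymptotics of the indicator-based value, which is what makes a normal limit and bootstrap inference available.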
  • Biomarker Screening in the Learning of Individualized Treatment Rules via Net Benefit Index
    Yiwang Zhou (University of Michigan - School of Public Health)
    In the context of personalized medicine, one of the central tasks is to establish individualized treatment rules (ITRs) for patients with heterogeneous responses to different types of treatments. Motivated by a diabetes clinical trial, we consider a problem where many biomarkers are potentially useful for improving an existing treatment allocation rule, including expansion and replacement of biomarkers in the process of refinement. This calls for a screening procedure that enables us to assess the added value of new biomarkers in order to derive an improved ITR. We propose a new test based on the net benefit index (NBI) that quantifies the gain or loss of treatment benefit due to reclassification, in which the optimal labels are obtained by the technique of support vector machine (SVM), in a similar spirit to the seminal outcome-weighted learning approach. We calculate the p-value of the proposed NBI-based test using a bootstrap null distribution generated by stratified permutations within each treatment arm. The proposed screening method is applicable to both single- and multiple-decision-point settings, where false discovery rate (FDR) control is used to account for multiplicity. The performance of the proposed method is evaluated by simulation studies and the motivating clinical trial study. Our results show that the NBI-based test controls the false discovery rate well and achieves a high rate of correct discovery (i.e. high sensitivity). In addition, this screening method demonstrates an improved correct classification rate when the ITR is expanded to include selected biomarkers.
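The stratified permutation scheme mentioned above — shuffling labels only within each treatment arm to build a null distribution — can be sketched generically. The statistic and names below are hypothetical placeholders, not the NBI itself:

```python
import numpy as np

def stratified_permutation_pvalue(stat_fn, labels, arms, y, n_perm=500, seed=0):
    """One-sided permutation p-value in which labels are shuffled only
    within each treatment arm, preserving arm-wise composition (the same
    stratification idea as the NBI test's null). `stat_fn(labels, y)`
    returns the test statistic; larger values indicate more signal."""
    rng = np.random.default_rng(seed)
    observed = stat_fn(labels, y)
    count = 0
    for _ in range(n_perm):
        perm = labels.copy()
        for a in np.unique(arms):
            idx = np.where(arms == a)[0]
            perm[idx] = perm[rng.permutation(idx)]  # shuffle within arm a
        count += stat_fn(perm, y) >= observed
    return (count + 1) / (n_perm + 1)
```

The `+1` in numerator and denominator is the usual finite-sample correction that keeps the permutation p-value strictly positive.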