Research

Financial Economics

  • Deep Learning Statistical Arbitrage 
    with J. Guijarro-Ordonez and G. Zanotti
    Management Science, Revise & Resubmit 
    Crowell Memorial Second Prize 2022
    Paper, Slides, Abstract
  • Stripping the Discount Curve - a Robust Machine Learning Approach 
    with D. Filipovic and Y. Ye
    Management Science, Revise & Resubmit 
    Best Paper at the 2022 Hong Kong Conference for Fintech, AI, and Big Data in Business
    Paper, Slides, Abstract
  • Shrinking the Term Structure 
    with D. Filipovic and Y. Ye
    Paper, Slides, Abstract
  • Asset-Pricing Factors with Economic Targets 
    with S. Bryzgalova, V. DeMiguel and S. Li
    Bates White Prize for the best paper at the 2023 SoFiE Conference
    INFORMS Finance Student Best Paper Award 2023, Faculty Co-author
    Paper, Slides, Abstract
  • Missing Financial Data 
    with S. Bryzgalova, S. Lerner and M. Lettau 
    Review of Financial Studies, accepted 
    Crowell Memorial First Prize 2022 
    ICPM Research Award 2022 
    Paper, Internet Appendix, Slides, Abstract
  • Forest Through the Trees: Building Cross-Sections of Stock Returns 
    with S. Bryzgalova and J. Zhu
    Journal of Finance, forthcoming 
    Best Paper in Asset Pricing Award at the SFS Cavalcade 2020
    Paper, Internet Appendix, Slides, Abstract
  • Machine Learning the Skill of Mutual Fund Managers 
    with R. Kaniel, Z. Lin and S. Van Nieuwerburgh 
    Journal of Financial Economics, 2023, 150(1), 94-138 
    Editor's Choice at the Journal of Financial Economics 
    Paper, Slides, Abstract
  • Deep Learning in Asset Pricing
    with L. Chen and J. Zhu
    Management Science, forthcoming 
    Best Paper Award at the Utah Winter Finance Conference 2020 
    Best Paper Award at the Asia-Pacific Financial Markets Conference 2020 
    CQA Academic Paper Competition, 2nd Prize, 2020 
    AQR Capital Insight Award, Honorable Mention, 2021 
    Best Paper IQAM Research Award 2022 
    Paper, Internet Appendix, Slides, Abstract
  • Asset Pricing and Investment with Big Data
    Machine Learning in Financial Markets: A Guide to Contemporary Practice, Cambridge University Press, forthcoming 
    Paper, Abstract
  • Factors that Fit the Time-Series and Cross-Section of Stock Returns
    with M. Lettau
    Review of Financial Studies, 2020, 33 (5), 2274-2325 
    Paper, Internet Appendix, Slides, Abstract
  • Understanding Systematic Risk: A High-Frequency Approach
    Journal of Finance, 2020, 75(4), 2179-2220 
    Paper, Internet Appendix, Abstract
  • Contingent Capital, Tail Risk, and Debt-Induced Collapse
    with N. Chen, P. Glasserman and B. Nouri 
    Review of Financial Studies, 2017, 30 (11), 3921-3969 
    Paper, Internet Appendix, Slides, Abstract
  • Optimal Stock Option Schemes for Managers
    with A. Chen
    Review of Managerial Science, 2014, 8(4), 437-464 
    Paper, Abstract
  • New Performance-Vested Stock Option Schemes
    with A. Chen and K. Sandmann
    Applied Financial Economics, 2013, 23(8), 709-727 
    Paper, Abstract

Statistics and Econometrics

  • Causal Inference for Large Dimensional Non-Stationary Panels with Two-Way Endogenous Treatment 
    with J. Duan and R. Xiong
    Slides, Abstract
  • Spanning the Option Price Surface 
    with D. Filipovic, S. Lerner and X. Ping
    Slides, Abstract
  • Automatic Outlier Rectification via Optimal Transport 
    with J. Blanchet, J. Li and G. Zanotti
    Abstract
  • Inference for Large Panel Data with Many Covariates 
    with J. Zou
    Paper, Slides, Abstract
  • Change-Point Testing and Estimation for Risk Measures in Time Series 
    with L. Fan and P. Glynn
    Journal of Financial Econometrics, Revise & Resubmit
    Paper, Abstract
  • Bayesian Imputation of Missing Data with Optimal Look-Ahead-Bias and Variance Tradeoff 
    with J. Blanchet, F. Hernandez, V. A. Nguyen and X. Zhang
    Management Science, Reject & Resubmit
    Paper, Abstract
  • Target PCA: Transfer Learning Large Dimensional Panel Data  
    with J. Duan and R. Xiong
    Journal of Econometrics, forthcoming
    Paper, Slides, Abstract
  • Large Dimensional Latent Factor Modeling with Missing Observations and Applications to Causal Inference 
    with R. Xiong
    Journal of Econometrics, 2023, 233(1), 271-301 
    George Nicholson Best Student Paper Finalist at INFORMS 2019, Faculty Co-author 
    Paper, Internet Appendix, Slides, Abstract
  • State-Varying Factor Models of Large Dimensions 
    with R. Xiong
    Journal of Business & Economic Statistics, 2022, 40(3), 1315-1333
    Paper, Internet Appendix, Slides, Abstract
  • Interpretable Sparse Proximate Factors for Large Dimensions 
    with R. Xiong
    Journal of Business & Economic Statistics, 2022, 40(4), 1642-1664
    Paper, Slides, Abstract
  • A Simple Method for Predicting Covariance Matrices of Financial Returns 
    with K. Johansson, M. G. Ogut, T. Schmelzer and S. Boyd
    Foundations and Trends in Econometrics, forthcoming
    Paper, Abstract
  • Estimating Latent Asset-Pricing Factors 
    with M. Lettau
    Journal of Econometrics, 2020, 218(1), 1-31 
    Dennis Aigner Award of the Journal of Econometrics, 2021
    Paper, Internet Appendix, Slides, Abstract
  • TextGNN: Improving Text Encoder via Graph Neural Network in Sponsored Search
    with J. Zhu, Y. Cui, Y. Liu, H. Sun, X. Li, L. Zhang, T. Yan, R. Zhang and H. Zhao
    The Web Conference 2021 (WWW '21)
    Paper, Abstract
  • Large-Dimensional Factor Modeling Based on High-Frequency Observations 
    Journal of Econometrics, 2019, 208 (1), 23-42
    Paper, Internet Appendix, Slides, Abstract
  • On the Existence of Sure Profits via Flash Strategies 
    with C. Fontana and E. Platen
    Journal of Applied Probability, 2019, 56(2), 384-397
    Paper, Abstract

All Papers

Work in Progress:

  • Term Structure of Characteristic-Sorted Portfolios and Multi-Horizon Investment (with S. Bryzgalova and S. Kozak)
  • Bridging the Yield Gap (with D. Filipovic and R. Wang)
  • International Yield Curves (with N. Camenzind, D. Filipovic and R. Wang)
  • Spanning the Option Price Surface (with D. Filipovic, S. Lerner and X. Ping)
  • A Universal Factor Model for Equities and Derivatives (with D. Filipovic, S. Lerner and X. Ping)
  • Do Algorithmic Traders Lead to Market Instability? A Multi-Agent Reinforcement Learning Approach (with Y. Fan and X. Yu)
  • Machine-learning the Skill of Bond Fund Managers (with R. Kaniel, S. Van Nieuwerburgh and L. Zhou)
  • How Much Sustainability is Really in Stock Prices? (with E. Archetti, E. Lütkebohmert-Holz and M. Rockel)
  • The Microstructure of Cryptocurrency Markets: Men vs. Machine (with G. Zanotti)
  • Large Dimensional Change Point Detection (with Y. Fan and J. Zou)

Abstracts

  • Deep Learning in Asset Pricing
    We use deep neural networks to estimate an asset pricing model for individual stock returns that takes advantage of the vast amount of conditioning information, while keeping a fully flexible form and accounting for time-variation. The key innovations are to use the fundamental no-arbitrage condition as the criterion function, to construct the most informative test assets with an adversarial approach and to extract the states of the economy from many macroeconomic time series. Our asset pricing model outperforms all benchmark approaches out-of-sample in terms of Sharpe ratio, explained variation and pricing errors and identifies the key factors that drive asset prices.

  • Missing Financial Data
    We document the widespread nature and structure of missing observations of firm fundamentals and show how to systematically deal with them. Missing financial data affects over 70% of firms that represent about half of the total market cap. Firm fundamentals have complex systematic missing patterns, invalidating traditional ad-hoc approaches to imputation. We propose a novel imputation method that yields a fully observed panel of firm fundamentals; it exploits both time-series and cross-sectional dependency in the data to impute missing values and allows for general systematic patterns of missingness. We document important implications for risk premia estimates, cross-sectional anomalies, and portfolio construction.

  • Forest through the Trees: Building Cross-Sections of Stock Returns
    We build cross-sections of asset returns for a given set of characteristics, that is, managed portfolios serving as test assets, as well as building blocks for tradable risk factors. We use decision trees to endogenously group similar stocks together by selecting optimal portfolio splits to span the Stochastic Discount Factor, projected on individual stocks. Our portfolios are interpretable and well diversified, reflecting many characteristics and their interactions. Compared to combinations of dozens (even hundreds) of single/double sorts, as well as machine learning prediction-based portfolios, our cross-sections are low-dimensional yet have up to three times higher out-of-sample Sharpe ratios and alphas.

  • Machine-Learning the Skill of Mutual Fund Managers
    We show, using machine learning, that fund characteristics can consistently differentiate high from low-performing mutual funds, before and after fees. The outperformance persists for more than three years. Fund momentum and fund flow are the most important predictors of future risk-adjusted fund performance, while characteristics of the stocks that funds hold are not predictive. Returns of predictive long-short portfolios are higher following a period of high sentiment. Our estimation with neural networks enables us to uncover novel and substantial interaction effects between sentiment and both fund flow and fund momentum.

  • Asset Pricing and Investment with Big Data
    We survey the most recent advances in using machine learning methods to explain differences in expected asset returns and form profitable portfolios. We discuss how to build better machine learning estimators by incorporating economic structure in the form of a no-arbitrage model. A no-arbitrage constraint in the objective function helps to estimate asset pricing models in spite of the low signal-to-noise ratio in financial return data. We show how to include this economic constraint in large dimensional factor models, deep neural networks and decision trees. The resulting models strongly outperform conventional machine learning models in terms of Sharpe ratios, explained variation and pricing errors.

  • Large Dimensional Latent Factor Modeling with Missing Observations and Applications to Causal Inference
    This paper develops the inferential theory for latent factor models estimated from large dimensional panel data with missing observations. We propose an easy-to-use all-purpose estimator for a latent factor model by applying principal component analysis to an adjusted covariance matrix estimated from partially observed panel data. We derive the asymptotic distribution for the estimated factors, loadings and the imputed values under an approximate factor model and general missing patterns. The key application is to estimate counterfactual outcomes in causal inference from panel data. The unobserved control group is modeled as missing values, which are inferred from the latent factor model. The inferential theory for the imputed values allows us to test for individual treatment effects at any time under general adoption patterns where the units can be affected by unobserved factors.
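As a rough illustration of estimating a covariance-type matrix from a partially observed panel, the sketch below averages cross products only over the periods in which both units are observed. This is an assumption about what such an adjustment can look like, not the paper's estimator: the function name, the use of `None` for missing entries, the made-up numbers, and the assumption that the data are already demeaned are all hypothetical. The principal component step is omitted.

```python
from typing import List, Optional

# Rows are time periods, columns are units; None marks a missing entry.
Panel = List[List[Optional[float]]]


def pairwise_second_moments(panel: Panel) -> List[List[Optional[float]]]:
    """Average x[t][i] * x[t][j] over the periods t in which both units are observed."""
    n_units = len(panel[0])
    moments: List[List[Optional[float]]] = [[None] * n_units for _ in range(n_units)]
    for i in range(n_units):
        for j in range(i, n_units):
            products = [
                row[i] * row[j]
                for row in panel
                if row[i] is not None and row[j] is not None
            ]
            # Leave the entry as None when the two units never overlap in time.
            value = sum(products) / len(products) if products else None
            moments[i][j] = moments[j][i] = value
    return moments


# Made-up panel with three periods and three units (assumed to be demeaned).
panel = [
    [1.0, 2.0, None],
    [3.0, None, 1.0],
    [-1.0, 4.0, 2.0],
]
moments = pairwise_second_moments(panel)
```

Each entry uses a different set of periods, so the resulting matrix is not guaranteed to be positive semidefinite in small samples.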

  • Target PCA: Transfer Learning Large Dimensional Panel Data
    This paper develops a novel method to estimate a latent factor model for a large target panel with missing observations by optimally using the information from auxiliary panel data sets. We refer to our estimator as target-PCA. Transfer learning from auxiliary panel data allows us to deal with a large fraction of missing observations and weak signals in the target panel. We show that our estimator is more efficient and can consistently estimate weak factors, which are not identifiable with conventional methods. We provide the asymptotic inferential theory for target-PCA under very general assumptions on the approximate factor model and missing patterns. In an empirical study of imputing data in a mixed-frequency macroeconomic panel, we demonstrate that target-PCA significantly outperforms all benchmark methods.

  • Factors that Fit the Time Series and Cross-Section of Stock Returns
    We propose a new method for estimating latent asset pricing factors that fit the time-series and cross-section of expected returns. Our estimator generalizes Principal Component Analysis (PCA) by including a penalty on the pricing error in expected returns. Our approach finds weak factors with high Sharpe-ratios that PCA cannot detect. We discover five factors with economic meaning that explain the cross-section and time-series of characteristic-sorted portfolio returns well. The out-of-sample maximum Sharpe-ratio of our factors is twice as large as that of PCA factors, with substantially smaller pricing errors. Our factors imply that a significant amount of characteristic information is redundant.

  • Estimating Latent Asset-Pricing Factors
    We develop an estimator for latent factors in a large-dimensional panel of financial data that can explain expected excess returns. Statistical factor analysis based on Principal Component Analysis (PCA) has problems identifying factors with a small variance that are important for asset pricing. We generalize PCA with a penalty term accounting for the pricing error in expected returns. Our estimator searches for factors that can explain both the expected return and covariance structure. We derive the statistical properties of the new estimator and show that our estimator can find asset-pricing factors, which cannot be detected with PCA, even if a large amount of data is available. Applying the approach to portfolio data we find factors with Sharpe-ratios more than twice as large as those based on conventional PCA and with smaller pricing errors.

  • State-Varying Factor Models of Large Dimensions
    This paper develops an inferential theory for state-varying factor models of large dimensions. Unlike constant factor models, loadings are general functions of some recurrent state process. We develop an estimator for the latent factors and state-varying loadings under a large cross-section and time dimension. Our estimator combines nonparametric methods with principal component analysis. We derive the rate of convergence and limiting normal distribution for the factors, loadings and common components. In addition, we develop a statistical test for a change in the factor structure in different states. We apply the estimator to U.S. Treasury yields and S&P500 stock returns. The systematic factor structure in Treasury yields differs in times of booms and recessions as well as in periods of high market volatility. State-varying factors based on the VIX capture significantly more variation and pricing information in individual stocks than constant factor models.

  • Interpretable Sparse Proximate Factors for Large Dimensions
    This paper proposes sparse and easy-to-interpret proximate factors to approximate statistical latent factors. Latent factors in a large-dimensional factor model can be estimated by principal component analysis (PCA), but are usually hard to interpret. We obtain proximate factors that are easier to interpret by shrinking the PCA factor weights and setting them to zero except for the largest absolute ones. We show that proximate factors constructed with only 5-10% of the data are usually sufficient to almost perfectly replicate the population and PCA factors without actually assuming a sparse structure in the weights or loadings. Using extreme value theory we explain why sparse proximate factors can be substitutes for non-sparse PCA factors. We derive analytical asymptotic bounds for the correlation of appropriately rotated proximate factors with the population factors. These bounds provide guidance on how to construct the proximate factors. In simulations and empirical analyses of financial portfolio and macroeconomic data we illustrate that sparse proximate factors are close substitutes for PCA factors with average correlations of around 97.5%, while being interpretable.
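The thresholding step described in this abstract (keep only the largest absolute weights and set the others to zero) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's procedure: the function name `sparsify_weights`, the rescaling to unit length, and the example weights are all hypothetical.

```python
import math
from typing import List


def sparsify_weights(weights: List[float], keep: int) -> List[float]:
    """Keep the `keep` largest absolute weights, zero the rest, rescale to unit length."""
    if not 0 < keep <= len(weights):
        raise ValueError("keep must be between 1 and len(weights)")
    # Indices of the `keep` entries with the largest absolute value.
    top = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)[:keep]
    kept = set(top)
    sparse = [w if i in kept else 0.0 for i, w in enumerate(weights)]
    norm = math.sqrt(sum(w * w for w in sparse))
    if norm == 0.0:
        raise ValueError("all selected weights are zero")
    return [w / norm for w in sparse]


pca_weights = [0.10, -0.70, 0.20, 0.68]  # made-up numbers for illustration
proximate = sparsify_weights(pca_weights, keep=2)
```

Here only the second and fourth assets keep a nonzero weight, so the resulting factor can be read as a long-short position in two assets.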

  • Understanding Systematic Risk - A High-Frequency Approach
    Based on a novel high-frequency data set for a large number of firms, I estimate the time-varying latent continuous and jump factors that explain individual stock returns. The factors are estimated using principal component analysis applied to a local volatility and jump covariance matrix. I find four stable continuous systematic factors, which can be well-approximated by a market, oil, finance, and electricity portfolio, while there is only one stable jump market factor. The exposure of stocks to these risk factors and their explained variation is time-varying. The four continuous factors carry an intraday risk premium that reverses overnight.

  • Large-Dimensional Factor Modeling Based on High-Frequency Observations
    This paper develops a statistical theory to estimate an unknown factor structure based on financial high-frequency data. We derive an estimator for the number of factors and consistent and asymptotically mixed-normal estimators of the loadings and factors under the assumption of a large number of cross-sectional and high-frequency observations. The estimation approach can separate factors for continuous and rare jump risk. The estimators for the loadings and factors are based on the principal component analysis of the quadratic covariation matrix. The estimator for the number of factors uses a perturbed eigenvalue ratio statistic. In an empirical analysis of the S&P 500 firms we estimate four stable continuous systematic factors, which can be approximated very well by market and industry portfolios. Jump factors are different from the continuous factors.

  • Contingent Capital, Tail Risk, and Debt-Induced Collapse
    Contingent capital in the form of debt that converts to equity as a bank approaches financial distress offers a potential solution to the problem of banks that are too big to fail. This paper studies the design of contingent convertible bonds and their incentive effects in a structural model with endogenous default, debt rollover, and tail risk in the form of downward jumps in asset value. We show that once a firm issues contingent convertibles, the shareholders’ optimal bankruptcy boundary can be at one of two levels: a lower level with a lower default risk or a higher level at which default precedes conversion. An increase in the firm’s total debt load can move the firm from the first regime to the second, a phenomenon we call debt-induced collapse because it is accompanied by a sharp drop in equity value. We show that setting the contractual trigger for conversion sufficiently high avoids this hazard. With this condition in place, we investigate the effect of contingent capital and debt maturity on capital structure, debt overhang, and asset substitution. We also calibrate the model to past data on the largest U.S. bank holding companies to see what impact contingent convertible debt might have had under the conditions of the financial crisis.

  • Deep Learning Statistical Arbitrage
    Statistical arbitrage exploits temporal price differences between similar assets. We develop a comprehensive conceptual framework for statistical arbitrage and a novel data-driven solution. First, we construct arbitrage portfolios of similar assets as residual portfolios from conditional latent asset pricing factors. Second, we extract their time series signals with a powerful machine-learning time-series solution, a convolutional transformer. Lastly, we use these signals to form an optimal trading policy that maximizes risk-adjusted returns under constraints. Our comprehensive empirical study on daily US equities shows a high compensation for arbitrageurs to enforce the law of one price. Our arbitrage strategies obtain consistently high out-of-sample mean returns and Sharpe ratios, and substantially outperform all benchmark approaches.

  • Stripping the Discount Curve - a Robust Machine Learning Approach
    We introduce a robust, flexible and easy-to-implement method for estimating the yield curve from Treasury securities. Our non-parametric method learns the discount curve in a function space that we motivate by economic principles. We show in an extensive empirical study on U.S. Treasury securities that our method strongly dominates all parametric and non-parametric benchmarks. It achieves substantially smaller out-of-sample yield and pricing errors, while being robust to outliers and data selection choices. We attribute the superior performance to the optimal trade-off between flexibility and smoothness, which positions our method as the new standard for yield curve estimation.

  • Shrinking the Term Structure
    We develop a conditional factor model for the term structure of Treasury bonds, which unifies non-parametric curve estimation with cross-sectional asset pricing. Our factors are investable portfolios and estimated with cross-sectional ridge regressions. They correspond to the optimal non-parametric basis functions that span the discount curve and are based on economic first principles. Cash flows are covariances, which fully explain the factor exposure of coupon bonds. Empirically, we show that four factors explain the discount bond excess return curve and term structure premium, which depends on the market complexity measured by the time-varying importance of higher order factors. The fourth term structure factor capturing complex shapes of the term structure premium is a hedge for bad economic times and pays off during recessions.

  • Asset-Pricing Factors with Economic Targets
    We propose a novel method to estimate latent asset-pricing factors that incorporate economic structure. Our estimator generalizes principal component analysis by including economically motivated cross-sectional and time-series moment targets that help to detect weak factors. Cross-sectional targets may capture monotonicity constraints on the loadings of factors or their correlation with fundamental macroeconomic innovations. Time-series targets may reward explaining expected returns or reducing mispricing relative to a benchmark reduced-form model. In an extensive empirical study, we show that these targets nudge risk factors to better span the pricing kernel, leading to substantially higher Sharpe ratios and lower pricing errors than conventional approaches.

  • On the Existence of Sure Profits via Flash Strategies
    We introduce and study the notion of sure profits via flash strategies, consisting of a high-frequency limit of buy-and-hold trading strategies. In a fully general setting, without imposing any semimartingale restriction, we prove that there are no sure profits via flash strategies if and only if asset prices do not exhibit predictable jumps. This result relies on the general theory of processes and provides the most general formulation of the well-known fact that, in an arbitrage-free financial market, asset prices (including dividends) should not exhibit jumps of a predictable direction or magnitude at predictable times. We furthermore show that any price process is always right-continuous in the absence of sure profits. Our results are robust under small transaction costs and imply that, under minimal assumptions, price changes occurring at scheduled dates should only be due to unanticipated information releases.

  • Inference for Large Panel Data with Many Covariates
    This paper proposes a novel testing procedure for selecting a sparse set of covariates that explains a large dimensional panel. Our selection method provides correct false detection control while having higher power than existing approaches. We develop the inferential theory for large panels with many covariates by combining post-selection inference with a novel multiple testing adjustment. Our data-driven hypotheses are conditional on the sparse covariate selection. We control for family-wise error rates for covariate discovery for large cross-sections. As an easy-to-use and practically relevant procedure, we propose Panel-PoSI, which combines the data-driven adjustment for panel multiple testing with valid post-selection p-values of a generalized LASSO that allows us to incorporate priors. In an empirical study, we select a small number of asset pricing factors that explain a large cross-section of investment strategies. Our method dominates the benchmarks out-of-sample due to its better size and power.

  • Change-Point Testing for Risk Measures in Time Series
    We propose novel methods for change-point testing for nonparametric estimators of expected shortfall and related risk measures in weakly dependent time series. We can detect general multiple structural changes in the tails of marginal distributions of time series under general assumptions. Self-normalization allows us to avoid the issues of standard error estimation. The theoretical foundations for our methods are functional central limit theorems, which we develop under weak assumptions. An empirical study of S&P 500 and US Treasury bond returns illustrates the practical use of our methods in detecting and quantifying market instability via the tails of financial time series.
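For readers unfamiliar with the risk measure, a basic nonparametric estimate of expected shortfall is the average of the worst returns in a sample. The sketch below shows only that textbook estimator, not the change-point tests of the paper. The function name, the tail-size rounding rule, and the sample returns are hypothetical choices made for illustration.

```python
import math
from typing import List


def expected_shortfall(returns: List[float], level: float = 0.05) -> float:
    """Average of the worst `level` fraction of returns, reported as a positive loss."""
    if not returns or not 0.0 < level <= 1.0:
        raise ValueError("need data and a level in (0, 1]")
    # Number of observations in the tail; at least one so the average is defined.
    n_tail = max(1, math.floor(level * len(returns)))
    worst = sorted(returns)[:n_tail]
    return -sum(worst) / n_tail


# Made-up sample of ten returns.
sample = [0.02, -0.05, 0.01, -0.10, 0.03, 0.00, -0.01, 0.04, -0.02, 0.01]
es_20 = expected_shortfall(sample, level=0.20)
```

With ten observations and a 20% level, the estimate averages the two worst returns, -0.10 and -0.05, giving a loss of about 0.075.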

  • Bayesian Imputation with Optimal Look-Ahead-Bias and Variance Tradeoff
    Missing time-series data is a prevalent problem in finance. Imputation methods for time-series data are usually applied to the full panel data with the purpose of training a model for a downstream out-of-sample task. For example, the imputation of missing returns may be applied prior to estimating a portfolio optimization model. However, this practice can result in a look-ahead-bias in the future performance of the downstream task. There is an inherent trade-off between the look-ahead-bias of using the full data set for imputation and the larger variance in the imputation from using only the training data. By connecting layers of information revealed in time, we propose a Bayesian consensus posterior that fuses an arbitrary number of posteriors to optimally control the variance and look-ahead-bias trade-off in the imputation. We derive tractable two-step optimization procedures for finding the optimal consensus posterior, with Kullback-Leibler divergence and Wasserstein distance as the measure of dissimilarity between posterior distributions. We demonstrate in simulations and an empirical study the benefit of our imputation mechanism for portfolio optimization with missing returns.

  • Causal Inference for Large Dimensional Non-Stationary Panels with Two-Way Endogenous Treatment
    We propose a novel method for estimating a latent factor structure with non-stationary two-way fixed effects in large dimensional panels with missing observations. We allow for general missing observations that can depend on the two-way fixed effects and the latent factor model. We show the consistency and asymptotic normality of the estimator under general assumptions. The generality of our framework is particularly relevant for causal inference in panels, where the unobserved counterfactual outcomes can be modeled as missing values. The combination of two-way endogenous treatment effects, potentially non-stationary time trends and a general latent factor structure makes our method broadly applicable for causal inference in panels. In multiple causal applications, we demonstrate that our approach can lead to different and more credible economic conclusions compared to its special cases of conventional difference-in-differences and PCA.

  • Spanning the Option Price Surface
    We propose a robust and flexible method for fitting the price surface of options. Our method leverages Reproducing Kernel Hilbert Spaces to learn a non-parametric modulation of a specified risk-neutral pricing measure which optimally trades off minimizing pricing errors with deviations from the specified distribution. We show empirically that our method outperforms parametric and naive kernel smoothing approaches out-of-sample on pricing index options. Our method is simple to implement, and the inherent linearity of our approach enables the efficient construction of hedging portfolios, which substantially outperform standard delta-hedging. We extend our method to price a cross-section of underlying stocks, enabling the accurate pricing of options for stocks without many observed options prices.

  • Automatic Outlier Rectification via Optimal Transport
    We propose a novel conceptual framework to detect outliers using optimal transport. Conventional outlier detection approaches typically use a two-stage procedure: first, outliers are detected and removed, and then estimation is performed on the cleaned data. Such an approach is not optimal because the outlier removal is not targeted to the estimation task. We propose a better automatic outlier rectification mechanism that integrates rectification and estimation within a joint optimization framework. The key idea is to utilize the optimal transport distance with a concave cost function to construct a rectification set in the space of probability distributions. Then, we select the best distribution within the rectification set to perform the estimation task. Notably, the concave cost function promotes a phenomenon known as "long hauls", in which the optimal transport plan moves only a portion of the data to a distant location, while leaving the rest of the data unchanged. This allows our rectification set to efficiently correct the data and automatically detect outliers during the optimization process. We demonstrate the effectiveness and superiority of our approach over conventional approaches in extensive simulation and empirical analyses for mean estimation, least absolute regression, and the fitting of option implied volatility surfaces.

  • TextGNN: Improving Text Encoder via Graph Neural Network in Sponsored Search
    Text encoders based on C-DSSM or transformers have demonstrated strong performance in many Natural Language Processing (NLP) tasks. Low latency variants of these models have also been developed in recent years in order to apply them in the field of sponsored search, which has strict computational constraints. However, these models are not a panacea for all Natural Language Understanding (NLU) challenges, as the pure semantic information in the data is not sufficient to fully identify user intents. We propose the TextGNN model that naturally extends the strong twin tower structured encoders with the complementary graph information from user historical behaviors, which serves as a natural guide to help us better understand the intents and hence generate better language representations. The model inherits all the benefits of twin tower models such as C-DSSM and TwinBERT so that it can still be used in the low latency environment while achieving a significant performance gain over the strong encoder-only baseline models in both offline evaluations and the online production system. In offline experiments, the model achieves a 0.14% overall increase in ROC-AUC with a 1% increased accuracy for long-tail low-frequency Ads, and in the online A/B testing, the model shows a 2.03% increase in Revenue Per Mille with a 2.32% decrease in Ad defect rate.

  • A Simple Method for Predicting Covariance Matrices of Financial Returns
    We consider the well-studied problem of predicting the time-varying covariance matrix of a vector of financial returns. Popular methods range from simple predictors like rolling window or exponentially weighted moving average (EWMA) to more sophisticated predictors such as generalized autoregressive conditional heteroscedastic (GARCH) type methods. Building on a specific covariance estimator suggested by Engle in 2002, we propose a relatively simple extension that requires little or no tuning or fitting, is interpretable, and produces results at least as good as MGARCH, a popular extension of GARCH that handles multiple assets. To evaluate predictors, we introduce a novel approach, evaluating the regret of the log-likelihood over a time period such as a quarter. This metric allows us to see not only how well a covariance predictor does overall, but also how quickly it reacts to changes in market conditions. Our simple predictor outperforms MGARCH in terms of regret. We also test covariance predictors on downstream applications such as portfolio optimization methods that depend on the covariance matrix. For these applications our simple covariance predictor and MGARCH perform similarly.
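The EWMA predictor mentioned in this abstract as a simple baseline can be sketched as follows. This is the textbook baseline only, not the paper's proposed extension. The function name, the half-life parameterization, the zero-mean assumption, and the sample returns are hypothetical choices for illustration.

```python
from typing import List

Matrix = List[List[float]]


def ewma_covariance(returns: List[List[float]], halflife: float) -> Matrix:
    """Exponentially weighted average of outer products r_t r_t' (zero mean assumed)."""
    if not returns or halflife <= 0:
        raise ValueError("need at least one observation and a positive halflife")
    beta = 0.5 ** (1.0 / halflife)  # per-period decay factor
    n = len(returns[0])
    weighted = [[0.0] * n for _ in range(n)]
    total_weight = 0.0
    for r in returns:  # oldest observation first
        total_weight = beta * total_weight + 1.0
        for i in range(n):
            for j in range(n):
                weighted[i][j] = beta * weighted[i][j] + r[i] * r[j]
    # Dividing by the summed weights makes the weights add up to one.
    return [[value / total_weight for value in row] for row in weighted]


history = [[0.01, -0.02], [0.00, 0.01], [-0.01, 0.03]]  # made-up daily returns
sigma = ewma_covariance(history, halflife=2.0)
```

A short half-life reacts quickly to new observations but is noisier; a very long half-life approaches the equally weighted sample second-moment matrix.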

  • Optimal Stock Option Schemes for Managers
    This paper analyzes which stock option scheme best aligns the interests of a firm’s manager and shareholders when both are risk-averse. We consider granting to the manager a basic fixed salary and one of the following four options: European, Parisian, Asian and American options. Choosing the strike of the options optimally, the shareholders can mostly implement a first best solution with all payoff schemes. The American option scheme best aligns the interests of the manager and the shareholders in the most common case in which the strike price equals the grant-date fair market value.

  • New Performance-Vested Stock Option Schemes
    In the present article, we analyze two effective nontraditional performance-based stock option schemes which we call Parisian and constrained Asian executives' stock option plans. Both options have a criterion on the terminal value similar to a call option, but in addition impose a restriction on the path of the firm's assets process. Under a Parisian option scheme, the bonus of the executives becomes effective when the stock price has outperformed a certain threshold for a fixed length of time. Under the constrained Asian scheme, the executives' compensation is coupled with the average performance of the stock price. We show that the values of both Executives' Stock Option (ESO) schemes are less sensitive to changes in risk than plain vanilla options and hence represent an alternative compensation scheme that could make exaggerated risk taking by the executives less likely.