Open Source Cross-Sectional Asset Pricing


  1. They discuss the p-hacking problem.
  2. They find that almost all characteristics show significant performance after an adjustment (a rank transformation).
  3. They provide an open-source database for future research.
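
These notes do not spell out the rank transformation, so the following is only a minimal sketch of one common variant: within each month, replace a characteristic by its cross-sectional rank scaled to [-0.5, 0.5]. The column names (`month`, `bm`) and the function name are hypothetical, and the scaling is an assumption, not necessarily the one used in the paper.

```python
import pandas as pd


def cross_sectional_rank(panel: pd.DataFrame, column: str) -> pd.Series:
    """Rank `column` within each month and scale the ranks to [-0.5, 0.5].

    Assumes `panel` has a `month` column (hypothetical name). A month with a
    single observation yields NaN, because the rank cannot be scaled.
    """
    grouped = panel.groupby("month")[column]
    ranks = grouped.rank(method="average")   # 1 = smallest value in the month
    counts = grouped.transform("count")      # non-missing observations per month
    return (ranks - 1) / (counts - 1) - 0.5


# Illustrative data only: three stocks in one month, two in the next.
example = pd.DataFrame(
    {
        "month": ["2020-01"] * 3 + ["2020-02"] * 2,
        "bm": [10.0, 30.0, 20.0, 5.0, 1.0],
    }
)
example["bm_rank"] = cross_sectional_rank(example, "bm")
```

Ranking removes the influence of outliers and puts every characteristic on the same scale, which makes results comparable across predictors.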


They collect 319 firm characteristics and find that almost all of them have predictive power for monthly stock returns.

Data sample description

In this paper, they collect 319 characteristics from:

  1. McLean and Pontiff (2016)

  2. Green, Hand, and Zhang (2017)

  3. Harvey, Liu, and Zhu (2016)

and separate them into four categories.

They apply two rules when constructing the data.

  1. Annual data are lagged by six months and quarterly variables by one quarter. The variables are not rebuilt only at the end of June or December. Instead, they are updated at a monthly frequency, so each monthly value reflects the most recent annual information that is at least six months old.
  2. All annual fiscal information is recorded as of the end of the month in which it is published, and it is treated as available six months after that month-end.
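
The two rules above can be sketched with an as-of join in pandas. This is a minimal illustration under stated assumptions, not the authors' code: the column names (`firm`, `datadate`, `book_equity`) and the function name are hypothetical, and the date of each annual record is taken to be `datadate`.

```python
import pandas as pd


def attach_lagged_annual(
    annual: pd.DataFrame, months: pd.DatetimeIndex, lag_months: int = 6
) -> pd.DataFrame:
    """Attach to each firm-month the latest annual record that is at least
    `lag_months` old. Expects hypothetical columns `firm` and `datadate`."""
    annual = annual.copy()
    # Snap the record date to its month-end, then move forward by the lag.
    month_end = annual["datadate"] + pd.offsets.MonthEnd(0)
    annual["available"] = month_end + pd.offsets.MonthEnd(lag_months)

    # One row per firm and calendar month-end.
    grid = pd.MultiIndex.from_product(
        [annual["firm"].unique(), months], names=["firm", "month"]
    ).to_frame(index=False)

    # Backward as-of join: newest record with available <= month, per firm.
    return pd.merge_asof(
        grid.sort_values("month"),
        annual.sort_values("available"),
        left_on="month",
        right_on="available",
        by="firm",
    )


# Illustrative data only: two fiscal years for one firm.
annual = pd.DataFrame(
    {
        "firm": ["A", "A"],
        "datadate": pd.to_datetime(["2020-12-31", "2021-12-31"]),
        "book_equity": [1.0, 2.0],
    }
)
months = pd.to_datetime(["2021-05-31", "2021-06-30", "2022-05-31", "2022-06-30"])
monthly = attach_lagged_annual(annual, months)
```

In this example the December 2020 record first appears in June 2021 and stays in use until the December 2021 record becomes available in June 2022. Calling the same function with `lag_months=3` would be one way to approximate the one-quarter lag for quarterly variables.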


For each characteristic, they form a long-short portfolio and test the null hypothesis that its mean return is zero.
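
A long-short test of this kind can be sketched as follows. This is a hedged illustration, not the paper's procedure: the quintile sort, equal weighting, the plain t-statistic without any autocorrelation adjustment, and the column names (`month`, `signal`, `ret`) are all assumptions.

```python
import numpy as np
import pandas as pd


def long_short_returns(
    panel: pd.DataFrame, signal: str = "signal", ret: str = "ret", n_bins: int = 5
) -> pd.Series:
    """Monthly return of an equal-weighted top-minus-bottom portfolio."""

    def one_month(df: pd.DataFrame) -> float:
        # Ranking first avoids duplicate bin edges when the signal has ties.
        bins = pd.qcut(df[signal].rank(method="first"), n_bins, labels=False)
        long_leg = df.loc[bins == n_bins - 1, ret].mean()
        short_leg = df.loc[bins == 0, ret].mean()
        return long_leg - short_leg

    return pd.Series({month: one_month(df) for month, df in panel.groupby("month")})


def t_stat_zero_mean(returns: pd.Series) -> float:
    """t-statistic for the null hypothesis that the mean return is zero."""
    r = returns.dropna()
    return r.mean() / (r.std(ddof=1) / np.sqrt(len(r)))


# Illustrative data only: ten stocks in each of two months.
panel = pd.DataFrame(
    {
        "month": ["2020-01"] * 10 + ["2020-02"] * 10,
        "signal": list(range(10)) * 2,
        "ret": [0.01 * s for s in range(10)] + [0.02 * s for s in range(10)],
    }
)
ls = long_short_returns(panel)
t = t_stat_zero_mean(ls)
```

A predictor would count as significant when the absolute t-statistic exceeds a chosen threshold. The threshold itself is a design choice that these notes do not specify.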

Based on these tests, they select 205 significant predictors for estimating expected stock returns.


Green, Jeremiah, John R. M. Hand, and X. Frank Zhang. 2017. “The Characteristics That Provide Independent Information about Average U.S. Monthly Stock Returns.” The Review of Financial Studies 30 (12): 4389–4436. doi:10.1093/rfs/hhx019.
Harvey, Campbell R., Yan Liu, and Heqing Zhu. 2016. “… and the Cross-Section of Expected Returns.” The Review of Financial Studies 29 (1): 5–68. doi:10.1093/rfs/hhv059.
McLean, R. David, and Jeffrey Pontiff. 2016. “Does Academic Research Destroy Stock Return Predictability?” The Journal of Finance 71 (1): 5–32. doi:10.1111/jofi.12365.

Peng Jiaxin
Posted on
July 13, 2022