Horvitz–Thompson and Weighted Least Squares
Inverse probability weighting (IPW) is a popular tool for estimating the average treatment effect (ATE) of a binary treatment under the conditional ignorability assumption. There are multiple variants of IPW. In particular, I used to be intrigued by the relationship between the so-called Horvitz–Thompson (HT) estimator and the Weighted Least Squares (WLS) estimator, both of which implement IPW. Both estimators are introduced in popular causal inference textbooks. For example, Angrist and Pischke (2009, p. 82) focus on the HT estimator, while Morgan and Winship (2014, pp. 228-9) focus on the WLS estimator. However, I haven’t seen the two introduced together in these textbooks, so their relationship can seem a little unclear. In fact, a quick simulation reveals that the classic version of the HT estimator generally gives a different ATE estimate from the WLS estimator in finite samples (although both are consistent, that is, they converge to the true ATE as the sample size grows).
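To make the finite-sample discrepancy concrete, here is a minimal simulation sketch in Python. Everything about the setup is my own assumption for illustration: a single confounder `x`, a logistic propensity score that is treated as known rather than estimated, and a linear outcome with a true ATE of 2. The HT estimate averages the inverse-probability-weighted outcomes over the full sample, while the WLS estimate is the coefficient on the treatment in a weighted regression of the outcome on a constant and the treatment.

```python
import numpy as np

# Hypothetical data-generating process (an assumption, not from any textbook).
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)                       # single confounder
p = 1.0 / (1.0 + np.exp(-x))                 # true propensity score, assumed known
d = rng.binomial(1, p)                       # binary treatment
y = 1.0 + 2.0 * d + x + rng.normal(size=n)   # outcome; true ATE = 2

# IPW weights: 1/p for treated units, 1/(1-p) for control units.
w = d / p + (1 - d) / (1 - p)

# Classic Horvitz-Thompson estimator: both sums are divided by n.
ate_ht = np.mean(d * y / p) - np.mean((1 - d) * y / (1 - p))

# WLS of y on [1, d] with weights w, solved by rescaling rows by sqrt(w).
X = np.column_stack([np.ones(n), d])
sw = np.sqrt(w)
beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
ate_wls = beta[1]

# For comparison: the difference in weighted group means,
# which is what the WLS coefficient on d reduces to algebraically.
ate_wmeans = (np.sum(d * y / p) / np.sum(d / p)
              - np.sum((1 - d) * y / (1 - p)) / np.sum((1 - d) / (1 - p)))

print(f"HT:  {ate_ht:.4f}")
print(f"WLS: {ate_wls:.4f}")
print(f"Difference in weighted means: {ate_wmeans:.4f}")
```

In a run like this, the two estimates should both land near 2 but not coincide, and the WLS coefficient should match the difference in weighted group means up to floating-point error. Changing the seed or the sample size changes the size of the gap, which is expected to shrink as `n` grows.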