Matching Markets II

Causal inference from endogenous groups

This post builds on my previous post on sorting bias in endogenous groups to illustrate a method for drawing causal inference from endogenously formed groups.

In my previous post, I showed the endogeneity problem that arises from matching into groups and how it can be resolved by controlling for unobservables that affect both group formation and outcomes.

This post illustrates a method that helps to estimate these unobservables and, at the same time, provides a source of exogenous variation to identify the parameters in the group outcome equation. The method is applied in Klein (2015a) and documented in the vignette (Klein 2015b) to the R package matchingMarkets (Klein 2015c).


The method relies on the assumption that the observed groups are in equilibrium. In a non-transferable utility context with aligned preferences, this results in a very simple equilibrium condition for two-group markets: the group with the maximum group valuation must be one of the equilibrium groups (see Klein 2015a).

This equilibrium condition can be restated in two simple inequalities:

  • lower bounds for equilibrium groups: the match valuation of one of the two equilibrium groups must be larger than that of all non-equilibrium groups.
  • upper bounds for non-equilibrium groups: the match valuation of each non-equilibrium group must be lower than the larger of the two equilibrium groups' valuations.
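
As a minimal sketch of how the two inequalities translate into bounds, the Python function below computes, for each group, the interval its valuation may lie in when all other valuations are held fixed. The function name `valuation_bounds` and the data layout (a list of valuations plus the indices of the two equilibrium groups) are assumptions made for this illustration; they are not taken from the matchingMarkets package.

```python
import math


def valuation_bounds(valuations, equilibrium):
    """Return {group: (lower, upper)} given all other valuations.

    valuations  -- list of match valuations, one per feasible group
    equilibrium -- indices of the two equilibrium groups (hypothetical layout)
    """
    eq = set(equilibrium)
    non_eq = [g for g in range(len(valuations)) if g not in eq]
    max_non_eq = max(valuations[g] for g in non_eq)
    max_eq = max(valuations[g] for g in eq)

    bounds = {}
    for g in range(len(valuations)):
        if g in eq:
            other = max(valuations[h] for h in eq if h != g)
            if other > max_non_eq:
                # The other equilibrium group already satisfies the
                # condition, so this valuation is unrestricted.
                bounds[g] = (-math.inf, math.inf)
            else:
                # Lower bound: must exceed every non-equilibrium group.
                bounds[g] = (max_non_eq, math.inf)
        else:
            # Upper bound: must stay below the best equilibrium group.
            bounds[g] = (-math.inf, max_eq)
    return bounds
```

For example, with valuations `[1.0, 0.2, 0.5, -0.3]` and equilibrium groups `(0, 1)`, group 0 receives a lower bound of 0.5, group 1 is unrestricted because group 0 already exceeds every non-equilibrium group, and groups 2 and 3 receive an upper bound of 1.0.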

I use the data augmentation approach proposed by Albert and Chib (1993), which treats the latent valuations as parameters. The Gibbs sampling method for the matching model V = alpha*W + eta iteratively simulates draws for the four parameter blocks:

  1. the match valuation V for non-equilibrium groups (black circles),
  2. the match valuation V for the first equilibrium group (red asterisk to right),
  3. the match valuation V for the second equilibrium group (red asterisk to left), and
  4. the regression coefficient alpha (blue dashed line).


The start values in the simulation above are: V = alpha = 0. The parameter draws for V come from a normal distribution with unit variance and the conditional mean given by the dashed line. Draws are restricted by the two inequalities derived from the equilibrium condition above:

  • grey shades indicate upper or lower bounds imposed on the current draw(s)
  • yellow shades indicate that no bounds are imposed because the valuation of one equilibrium group is higher than that of the highest non-equilibrium group (and therefore the equilibrium condition is met for any value of the draw).
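
The four blocks and the bounded draws described above can be sketched in Python as follows. This is an illustrative sketch under several assumptions of mine, not the implementation in matchingMarkets: W is a single scalar covariate per group, feasible groups come in complementary pairs (indices 2k and 2k+1), alpha has a flat prior, and all function names (`draw_truncated_normal`, `simulate_markets`, `gibbs_sampler`) are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf / inverse cdf


def draw_truncated_normal(mean, lower, upper, rng):
    """Draw from N(mean, 1) restricted to [lower, upper] by inverse cdf."""
    p_lo = 0.0 if lower == -math.inf else STD_NORMAL.cdf(lower - mean)
    p_hi = 1.0 if upper == math.inf else STD_NORMAL.cdf(upper - mean)
    u = p_lo + (p_hi - p_lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf's argument in (0, 1)
    x = mean + STD_NORMAL.inv_cdf(u)
    return min(max(x, lower), upper)  # guard against rounding in far tails


def simulate_markets(n_markets, n_pairs, alpha, rng):
    """Simulate markets as (W, equilibrium pair); needs n_pairs >= 2."""
    markets = []
    for _ in range(n_markets):
        W = [rng.gauss(0.0, 1.0) for _ in range(2 * n_pairs)]
        V = [alpha * w + rng.gauss(0.0, 1.0) for w in W]
        best = max(range(len(V)), key=V.__getitem__)
        partner = best + 1 if best % 2 == 0 else best - 1
        # The equilibrium pair is the one containing the top valuation.
        markets.append((W, (best, partner)))
    return markets


def gibbs_sampler(markets, n_iter=1000, burn_in=200, seed=1):
    """Return post-burn-in draws of alpha (flat prior, unit error variance)."""
    rng = random.Random(seed)
    alpha = 0.0                                  # start values: V = alpha = 0
    V = [[0.0] * len(W) for W, _ in markets]
    sum_ww = sum(w * w for W, _ in markets for w in W)
    draws = []
    for it in range(n_iter):
        for (W, eq), v in zip(markets, V):
            non_eq = [g for g in range(len(W)) if g not in eq]
            # Block 1: non-equilibrium groups, bounded from above.
            upper = max(v[eq[0]], v[eq[1]])
            for g in non_eq:
                v[g] = draw_truncated_normal(alpha * W[g], -math.inf, upper, rng)
            # Blocks 2 and 3: equilibrium groups, bounded from below only
            # if the other equilibrium group does not already dominate.
            max_non_eq = max(v[g] for g in non_eq)
            for g, other in (eq, eq[::-1]):
                lower = -math.inf if v[other] > max_non_eq else max_non_eq
                v[g] = draw_truncated_normal(alpha * W[g], lower, math.inf, rng)
        # Block 4: alpha | V is normal around the least-squares coefficient.
        sum_wv = sum(w * x for (W, _), v in zip(markets, V) for w, x in zip(W, v))
        alpha = rng.gauss(sum_wv / sum_ww, 1.0 / math.sqrt(sum_ww))
        if it >= burn_in:
            draws.append(alpha)
    return draws
```

Calling `gibbs_sampler(simulate_markets(100, 2, 1.0, random.Random(0)))` returns a list of post-burn-in draws of alpha whose average serves as a point estimate. The inverse-cdf draw is chosen here only because it needs nothing beyond the standard library; a rejection sampler or a dedicated truncated-normal routine would serve equally well.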

See my GitHub page for the gif’s source code.


The Gibbs sampling algorithm is computationally intensive, but the bounds provide a valuable source of exogenous variation, which Klein (2015a) uses to correct for endogenous group formation.

The identifying exclusion restriction is that the equilibrium bounds depend on the characteristics of all agents in the market, but the performance of a matched group is determined only by its own members.


Albert, James H, and Siddhartha Chib. 1993. “Bayesian Analysis of Binary and Polychotomous Response Data.” Journal of the American Statistical Association 88 (422). Taylor & Francis: 669–79.

Klein, T. 2015a. Does anti-diversification pay? A one-sided matching model of microcredit. Cambridge Working Papers in Economics 1521. Faculty of Economics, University of Cambridge.

———. 2015b. Analysis of stable matchings in R: Package matchingMarkets. Vignette to R package matchingMarkets. The Comprehensive R Archive Network.

———. 2015c. matchingMarkets: Analysis of stable matchings. R package version 0.1-7. The Comprehensive R Archive Network.

