Paper from 2019: Emerging the U.S. Firm Size Distribution Using 4.2 Billion Individual Tax Records

Joseph Shaheen RAIS Paper


The firm size distribution describes important economic and labor properties of any economy. Government entities must expend enormous resources on data collection, cleaning, and analysis to construct this and other important distributions describing the aggregate properties of large economies. In the U.S., this process can be cumbersome and relies on querying multiple databases and utilizing significant computational resources.
I show that construction of the U.S. firm size distribution is plausible using only individual income tax records (W-2s) drawn directly from Internal Revenue Service tax records (microdata) and that the emergent distribution is statistically identical to what is reported by the United States Census Bureau.
The methodology represents an incremental advance for population-scale studies in economic analysis, specifically firm and labor analysis. Finally, this paper acts as a re-validation of earlier work in fitting the firm size distribution.
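As a rough illustration of the idea in the abstract, the sketch below counts distinct employees per employer from W-2-style records and then tabulates how many firms have each size. It is a toy example under stated assumptions, not the paper's actual pipeline: the record layout `(employer_id, employee_id)`, the function names, and the sample data are all hypothetical.

```python
from collections import Counter

def firm_sizes(w2_records):
    """Return {employer_id: number of distinct employees}.

    Assumes each record is an (employer_id, employee_id) pair; this
    layout is hypothetical and chosen only for illustration.
    """
    # Deduplicate pairs so a worker with several W-2s from the same
    # employer is counted once for that employer.
    unique_pairs = set(w2_records)
    return Counter(employer_id for employer_id, _ in unique_pairs)

def size_distribution(sizes):
    """Return {firm size: number of firms with that size}."""
    return Counter(sizes.values())

# Hypothetical toy data: four employers, seven workers, one duplicate record.
records = [
    ("A", 1), ("A", 2), ("A", 3), ("A", 3),
    ("B", 4),
    ("C", 5), ("C", 6),
    ("D", 7),
]

sizes = firm_sizes(records)
distribution = size_distribution(sizes)

for size in sorted(distribution):
    print(f"firms with {size} employee(s): {distribution[size]}")
```

At population scale the same grouping would have to run over billions of records, so an in-memory `Counter` stands in here for whatever distributed aggregation the real analysis requires.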

Published by

Dr. Joseph A.E. Shaheen

Computational Social Scientist with a twist of network science, social network analysis, data science, and random thoughts.
