SOI Tax Stats - The Work SOI Does


We work with numbers, plain and simple. We scrutinize data from individual, corporate, estate, and nonprofit returns, among others, to study and report on such things as:

  • Individual tax studies, including sources of income, exemptions, deductions, taxable income, income tax, tax credits, and tax payments
  • Sole proprietorships by industry, business receipts, deductions, and net income
  • Individuals’ sales of capital assets, use of paid preparers, use of particular forms, and use of medical savings accounts
  • Cohort panels of taxpayers used for economic modeling
  • Average and marginal tax rates and alternative income concepts
  • Tax analysis of high-income tax returns
  • Migration and geographic data, along with Americans living abroad
  • Corporate (including S-corporations) taxation classified by industries, accounting periods, and sizes of assets, receipts, and income taxes after credits
  • Partnership and limited liability company information
  • The assets, liabilities, income, deductions, earnings and profits, foreign taxes, and transactions of 7,500 foreign corporations controlled by U.S. parent corporations
  • Foreign tax credits, foreign sales, foreign trusts, and export-related data
  • International boycotts
  • Aspects of investment and activity in the United States by foreign persons
  • Tax-exempt bond issues

Then we publish the data. A list of these publications is available on our products and services page.

How We Do It All

We design a study, determine the sample size and selection criteria, capture the data, assign weights, and analyze sample variability. Then we publish the results.

Sample Design and Selection

Most SOI operations begin by sampling returns from the master file system compiled by IRS processing centers. We have computers in several of these centers dedicated to statistical processing. We generally compile statistics based on stratified probability samples of tax or information returns, with strata defined by classes such as size of income or industrial activity.
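As a rough illustration of stratified probability sampling, the sketch below groups returns into strata and draws a random sample from each at its own rate. The income cutoff, the sampling rates, the field names, and the use of simple random sampling within each stratum are all assumptions made for the example; they are not SOI's actual design.

```python
import random

def assign_stratum(ret):
    # Hypothetical rule: classify a return by size of income.
    return "high_income" if ret["income"] >= 200_000 else "low_income"

def stratified_sample(returns, sampling_rates, seed=0):
    """Draw a simple random sample within each stratum at that stratum's rate."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    by_stratum = {}
    for ret in returns:
        by_stratum.setdefault(assign_stratum(ret), []).append(ret)
    sample = {}
    for stratum, members in by_stratum.items():
        # At least one return per stratum, otherwise rate * stratum size.
        n = max(1, round(len(members) * sampling_rates[stratum]))
        sample[stratum] = rng.sample(members, n)
    return sample

# Made-up population of 100 returns with steadily rising income.
returns = [{"id": i, "income": 20_000 + 5_000 * i} for i in range(100)]

# Hypothetical rates: the high-income stratum is sampled more heavily.
sample = stratified_sample(returns, {"low_income": 0.1, "high_income": 0.5})
```

Sampling some strata at higher rates than others is what lets a sample hold a larger share of complex returns than the population does.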

Data Capture Techniques

Data items pulled electronically from the master file are augmented with items captured from hard copies of taxpayers' returns. To ensure that the statistics are consistent and reliable, SOI economists develop extensive online tests and error resolution procedures.

Weighting and Estimation

The IRS processes about 200 million tax returns each year, and SOI uses about half a million of them for our statistics. Although this is a small share of the total, our samples include a higher fraction of complex returns and capture far more items from each return. We therefore assign weights so that the sample can be used to make estimates for the whole population of filers. The weight for a stratum is the population count of returns filed in that stratum divided by the count of sample returns in the same stratum. The data on each return in a stratum are then multiplied by the weight assigned to that stratum.
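The weighting rule described above can be sketched in a few lines. The stratum names, counts, and income figures here are made up for illustration, and the function names are hypothetical.

```python
from collections import Counter

def stratum_weights(population_counts, sample_counts):
    """Weight for a stratum = population count / sample count."""
    return {s: population_counts[s] / sample_counts[s] for s in sample_counts}

def weighted_total(sample_returns, weights, item):
    """Estimate a population total: sum of (stratum weight * item value)."""
    return sum(weights[r["stratum"]] * r[item] for r in sample_returns)

# Hypothetical population counts of returns filed, by stratum.
population_counts = {"low_income": 1000, "high_income": 100}

# Hypothetical sample: 2 low-income returns and 4 high-income returns.
sample_returns = [
    {"stratum": "low_income", "income": 30_000},
    {"stratum": "low_income", "income": 50_000},
    {"stratum": "high_income", "income": 400_000},
    {"stratum": "high_income", "income": 600_000},
    {"stratum": "high_income", "income": 500_000},
    {"stratum": "high_income", "income": 500_000},
]

sample_counts = Counter(r["stratum"] for r in sample_returns)
weights = stratum_weights(population_counts, sample_counts)
estimated_total = weighted_total(sample_returns, weights, "income")
```

In this example each sampled low-income return stands for 500 filed returns and each sampled high-income return stands for 25, so the more heavily sampled stratum carries the smaller weight.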

Sampling Variability

The particular sample used in a study is only one of a large number of possible random samples that could have been selected using the same sample design. We calculate the standard error of the estimate to measure the precision with which an estimate from a particular sample approximates the average result of the possible samples.
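The text does not say how the standard error is computed. As one possible illustration, the sketch below uses the textbook formula for an estimated total under stratified simple random sampling without replacement, where the variance is the sum over strata of N² · (1 − n/N) · s² / n. The choice of formula, the function name, and the figures are assumptions for the example and may differ from SOI's actual method.

```python
import math
from statistics import variance

def stratified_total_se(strata):
    """Standard error of an estimated total under stratified simple random sampling.

    Assumed textbook formula: Var = sum over strata of N^2 * (1 - n/N) * s^2 / n,
    where N is the stratum population count, n the stratum sample size,
    and s^2 the sample variance of the item within the stratum.
    """
    var_total = 0.0
    for stratum in strata:
        N = stratum["population_count"]
        values = stratum["sample_values"]
        n = len(values)
        s2 = variance(values)  # sample variance (n - 1 denominator)
        var_total += N ** 2 * (1 - n / N) * s2 / n
    return math.sqrt(var_total)

# Hypothetical strata with made-up income values.
strata = [
    {"population_count": 1000, "sample_values": [30_000, 50_000]},
    {"population_count": 100, "sample_values": [400_000, 600_000, 500_000, 500_000]},
]

se = stratified_total_se(strata)
```

A smaller standard error relative to the estimate indicates that other samples drawn under the same design would tend to give results close to this one.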

