Project Resources

Tips on Approaching the Regression Project

There are many ways to approach your project. Two common starting points are a data set and an idea.

Starting with a Data Set

First you need to find your data. Here are some links that may prove to be useful:

Once you have your data, you should ask yourself, "What kind of comparisons can I make?" For example,

  • Compare attributes of different counties, states, countries or other geographical units.
  • Compare attributes of different companies.
  • Compare attributes of different people.


Don't forget to ask yourself, "What is the unit of analysis?"

Starting with an Idea

Your idea could come from many places, including:

Report Writing: Communicating Data Analysis Results



A report is organized into six sections:

(1) Title and Abstract,
(2) Introduction,
(3) Data Characteristics,
(4) Model Selection and Interpretation,
(5) Summary and Concluding Remarks, and
(6) References and Appendix.

Sections (1) and (2) serve as preparatory material, designed to orient the reader.
Sections (3) and (4) form the main body of the report, while
Sections (5) and (6) make up the ending.

Detailed points to include in each section

(1) Title and Abstract

The title should be concise and to the point.
In the abstract, be sure to respond to such questions as:

  • What problem was studied?
  • How was it studied? Describe the source of the data and the techniques used to solve the problem.
  • What were the findings?

(2) Introduction

As with the general report, the introduction should be partitioned into three sections:

  1. Orientation Material (describe purpose)
  2. Key Aspects of the Report
  3. A Plan of the Paper

(3) Data Characteristics
  • Describe the data provided and the data used in the analysis. Document the source of the data.
  • Identify the nature of the data (longitudinal versus cross-sectional, observational versus experimental, et cetera).
  • Do the scatter plots clearly indicate primary relationships in cross-sectional data, or do the time series plots clearly indicate the most important trends in longitudinal data?
  • Do the summary statistics indicate primary relationships or emphasize trends?
  • Do the plots, and concomitant summary statistics, serve to identify unusual points that are worthy of special consideration?
  • Are there too many statistics and plots in the main body?
  • Do the details presented in this section foreshadow the development of the model in the subsequent section?
  • Do other salient features of the data appear in the Appendix?
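
As a rough illustration of the summary statistics and the check for unusual points described above, here is a minimal Python sketch. The data values, the function names (summarize, pearson_r, flag_unusual), and the two-standard-deviation cutoff are all assumptions made for this example; they are not part of the course materials, and other screening rules may suit your data better.

```python
import statistics

# Hypothetical illustrative data, not taken from any real source.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 30.0]  # last value is deliberately unusual


def summarize(values):
    """Return basic summary statistics for one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / (saa * sbb) ** 0.5


def flag_unusual(values, threshold=2.0):
    """Indices of observations more than `threshold` sample standard
    deviations from the mean (an assumed, simple screening rule)."""
    m, s = statistics.mean(values), statistics.stdev(values)
    return [i for i, v in enumerate(values) if abs(v - m) > threshold * s]


if __name__ == "__main__":
    print("x:", summarize(x))
    print("y:", summarize(y))
    print("correlation:", pearson_r(x, y))
    print("unusual y indices:", flag_unusual(y))
```

A point flagged this way is a candidate for special consideration in the report, not an automatic deletion; a scatter plot of the same data should tell the same story.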

(4) Model Selection and Interpretation
  1. an outline of the section. Is the purpose behind this model selection clear (understanding versus forecasting)?
  2. a statement of the recommended model. Is the model reasonable?
  3. an interpretation of the model, parameter estimates and any broad implications of the model. Have the variables and coefficients of the model been adequately interpreted?
  4. the basic justifications of the model. Has the model been justified (coefficient of determination (R²), s, t-ratios, residual plots, out-of-sample validation)?
  5. an outline of a thought process that would lead up to this model.
  6. a discussion of alternative models. Have alternative models been explored (transforms, deleting parts of data, analysis of components)?
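
The justification measures named in item 4 can be sketched for the simplest case, a regression with one explanatory variable. The Python below is an illustration under stated assumptions: the data are made up, the function names (fit_simple_regression, out_of_sample_rmse) are hypothetical, and holding out every third observation is just one possible validation scheme. A real project would normally use a statistical package and more than one explanatory variable.

```python
import math


def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1*x, with common summary measures."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    s = math.sqrt(sse / (n - 2))   # residual standard deviation
    se_b1 = s / math.sqrt(sxx)     # standard error of the slope
    return {
        "b0": b0,
        "b1": b1,
        "r2": 1 - sse / syy,       # coefficient of determination
        "s": s,
        "t_b1": b1 / se_b1,        # t-ratio for the slope
        "residuals": residuals,    # plot these against x or fitted values
    }


def out_of_sample_rmse(fit, x_new, y_new):
    """Root mean squared prediction error on data not used in the fit."""
    errors = [yi - (fit["b0"] + fit["b1"] * xi) for xi, yi in zip(x_new, y_new)]
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


# Hypothetical illustrative data, not taken from any real source.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.2, 16.9, 19.1, 20.8]

# Assumed validation scheme: hold out every third observation.
holdout = {2, 5, 8}
x_train = [xi for i, xi in enumerate(x) if i not in holdout]
y_train = [yi for i, yi in enumerate(y) if i not in holdout]
x_test = [xi for i, xi in enumerate(x) if i in holdout]
y_test = [yi for i, yi in enumerate(y) if i in holdout]

fit = fit_simple_regression(x_train, y_train)
rmse = out_of_sample_rmse(fit, x_test, y_test)

if __name__ == "__main__":
    print({k: v for k, v in fit.items() if k != "residuals"})
    print("out-of-sample RMSE:", rmse)
```

Comparing the out-of-sample error with s gives a quick sense of whether the model predicts new observations about as well as it fits the data used to build it.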

(5) Summary and Concluding Remarks
  • Restate the results of the report concisely, in different words than the executive summary.
  • Comment on the quality of the data and the reliability of the concomitant inferential statements. What type of data would improve the reliability of your statements?
  • Include ideas that you have about future investigations.

(6) References and Appendix
  • Is there a good relationship between the discussion in the main report and the presentation in the Appendix?
  • Is each portion of the Appendix clearly identified, especially with respect to its relation to the main body of the report?

Overall assessment of the quality of the report
Does it answer the question?


Sample Projects from Students

These samples may give you a feel for what to include (or not) in your write-up. These are actual student reports. Naturally, this is a biased sample - these are top reports!


This project is from Math 238 taught at the University of Connecticut in Fall 2007.


Determinants of Mutual Fund Performance by Nathan Rule, Miles Carpenter, and Thomas Murawski


These projects are from Actuarial Science 654 taught at the University of Wisconsin in Spring 2008.


Predicting Charitable Contributions by Lauren Meyer


Uninsured in the Modern Day by Missy Pinney and Yang Liu


Seasonal and Temporary Employment and Health Insurance Coverage by Matthew Mark


Predicting Late Debt Payments by Steve Minturn


Date: 25 October 2018