# Introduction and Statement of Research Question

• Pose and motivate the question to be investigated with your data and econometric model.
o Good econometric questions are generally based on economic theory; however, econometrics can be used to analyze all kinds of cause-and-effect relationships even if they don’t directly relate to previous theory courses. You can study just about anything that interests you. Since the goal of the project is causal identification of the effect of one variable on another, a good question should specify the one primary X variable and one primary Y variable that you are interested in.
1. Formulation of the Model
• Express the question both verbally and in the form of an equation to be estimated. Be sure to include the names and units of the dependent and independent variables.
• You must include at least three independent variables, but can include as many as you’d like beyond that.
• Explain the mechanisms you believe pertain to your study question and why you selected the particular X variables that you did and why you didn’t select others.
• You may also want to briefly cite relevant literature (e.g. previous studies) that you come across while researching your topic.
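As an illustration of the form such an equation might take (a generic sketch, not a required specification), a model with one primary regressor and two additional controls could be written as below. The symbols are placeholders: Y_i is the dependent variable for observation i, X_{1i} is the primary variable of interest, X_{2i} and X_{3i} are additional controls, and u_i is the error term.

```latex
\[
  Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i
\]
```

Here \beta_1 is the coefficient that answers the research question. In a purely hypothetical example, Y_i might be hourly wage in dollars and X_{1i} years of schooling, in which case \beta_1 is measured in dollars per additional year of schooling, holding the controls fixed.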
2. Data Description
• Describe the data used to estimate the model. Your data set must have at least 40 observations.
o You should construct a table that provides the mean, minimum, maximum and standard deviation (and any other summary statistics you feel are relevant) of all variables in the model.
▪ Describe the content of the table(s) in the text of your paper.
▪ You might also want to show and discuss key scatterplots or graphs of interest.
• It is up to you as to where you find the data. Lots of economic data are available on the web (e.g. government agency websites).
• Your data does not need to be strictly economic though.
o Look up Freakonomics and other work by Steve Levitt; you can study whatever interests you. If you like sports, pull some data off ESPN; if you like comic books, key in data from a price guide; if you raise horses, you may know where to get sales data; etc.
• You should not simply use data that came with the book.
• This is your opportunity to do something new and creative.
• You may want to focus on cross-sectional data since we discuss this most in class; however, if you do choose an alternate type of data analysis, you should be careful in your write-up and interpretation.
• To use STATA, it may be necessary to import your data from another format (e.g., Excel, ASCII, CSV, etc.). To do this, use the steps we have followed in the problem sets. If the data you’re interested in are hard to import into STATA, let me know so I can help you troubleshoot.
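The summary statistics table described above can also be sanity-checked outside STATA. The following is a minimal sketch in Python (standard library only), not the workflow used in class. The variable names `wage` and `educ`, the numbers, and the helper functions `summarize` and `summary_table` are all made up for illustration.

```python
import statistics

# Hypothetical, made-up data: replace with the variables from your own data set.
data = {
    "wage": [12.5, 15.0, 9.75, 22.0, 18.25],
    "educ": [12, 16, 10, 18, 14],
}


def summarize(values):
    """Return the mean, sample standard deviation, minimum, and maximum."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (divides by n - 1)
        "min": min(values),
        "max": max(values),
    }


def summary_table(columns):
    """Format one row of summary statistics per variable as plain text."""
    header = (
        f"{'Variable':<10}{'Obs':>5}{'Mean':>10}"
        f"{'Std. Dev.':>12}{'Min':>10}{'Max':>10}"
    )
    rows = [header]
    for name, values in columns.items():
        s = summarize(values)
        rows.append(
            f"{name:<10}{len(values):>5}{s['mean']:>10.2f}"
            f"{s['sd']:>12.2f}{s['min']:>10.2f}{s['max']:>10.2f}"
        )
    return "\n".join(rows)


print(summary_table(data))
```

Whatever tool produces the numbers, the table in the paper should still be formatted by hand and described in the text.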
3. Empirical Results
• Present the results of the OLS estimation (and/or other appropriate technique learned in class) in the form of a table(s).
o Tables should be formatted as easy-to-read tables rather than cut and pasted out of a specific statistical program.
o You may want to, for example, examine your key parameter estimates with and without additional regressors to compare any differences (i.e., run a single regressor model and then a multiple regression model).
• You should show at least three specifications of your design.
o These will likely include various non-linear versions of a baseline model, for example by adding a polynomial or interaction variable, or by creating relevant categorical or binary variables to analyze group effects.
• Explain the meaning of estimated coefficients in your model and whether or not coefficients are statistically significant and at what level.
• Also discuss the goodness of fit measure for the model (e.g. R-squared, SER, adjusted R-squared).
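To see why comparing the key estimate with and without additional regressors is informative, consider the following minimal sketch. It is written in Python (standard library only) rather than STATA, and the data are made up so that y = 1 + 2*x1 + 3*x2 holds exactly, with x1 and x2 positively correlated. The helper functions `solve` and `ols` are hypothetical names introduced here, and the estimates come from solving the normal equations directly.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(y, regressors):
    """Return OLS coefficients [intercept, slope_1, ...] from X'X b = X'y."""
    rows = [[1.0] + [col[i] for col in regressors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up data: y depends on both x1 and x2, and x1 and x2 move together.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.0, 1.0, 2.0, 3.0, 3.0, 5.0]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

short_fit = ols(y, [x1])      # single regressor model
long_fit = ols(y, [x1, x2])   # multiple regression model

print("slope on x1, single regressor:", round(short_fit[1], 3))
print("slope on x1, with x2 included:", round(long_fit[1], 3))
```

In this constructed example the single regressor slope on x1 is larger than the true value of 2 because x1 picks up part of the effect of the omitted x2. A shift of this kind between specifications is the sort of difference worth discussing in the results section.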
4. Summary and Discussion
o Carefully explain caveats to your study, especially whether or not certain parameter estimates may be biased.
o Suggest what other independent variables, possible functional forms, or statistical tests might be appropriate to include and any interesting follow-up questions or extensions that may have come to mind.
o Address all issues of internal and external validity.
Bibliography
• All cited sources should be reported in a bibliography in MLA format.
o One useful reference is your textbook.
• Also include, at the end, anything relevant that is not within the main document, such as attachments of key STATA output. (Here I want regression or summary statistics output, not a series of lines of code.)
Formatting: