Final Project

Author: Alex Cardazzi

Affiliation: Old Dominion University

All materials can be found at alexcardazzi.github.io.

Project Summary

Students will be expected to carry out an econometric analysis on a topic of their choosing (to be approved by the instructor). The project should focus on identification and estimation. The goal of this project is to convince people not only that you know R and econometrics, but also that you can present and articulate technical concepts.

While students may use any method discussed in class, they may not use time series analysis. Unfortunately, we do not have time to study time series methods, so a project using such methods would be inappropriate. Put differently, students must use either cross-sectional or panel data (though panel is highly encouraged).

Students will submit checkpoints throughout the semester to ensure they are making progress on the project. However, as stated in the syllabus, students are offered “off-ramps” for Checkpoints 3-7. These off-ramp assignments take the place of the checkpoint, but move the student from the project track to an alternative track with a lower potential overall grade. Please see the syllabus for more information.

The expectations for each checkpoint and the final project are as follows.

Checkpoint 1

Create ePortfolio | 4 Points

For this checkpoint, students will have to do some setup for the rest of the semester. First, they will have to create an outline of an ePortfolio. Maximally, you should think of your ePortfolio as a personal website. Minimally, your ePortfolio will act as an online resume where you can show off your work. Second, students must upload/embed a placeholder HTML file for their final project. This is important because students will need to repeat this step at the end of the semester when submitting their final project. It is better to do this now, before things get too crazy.

ePortfolio

ODU supports student ePortfolios via the Office of Academic Success Initiatives & Support, and students are encouraged to explore these resources (e.g., they have a list of setup tutorials). I have personally used Google Sites in the past and found it to be both intuitive and functional. Other faculty members might suggest websites such as Wix. The two major advantages of these services are that they are “point and click” tools that require no coding and that you can continue using them even after you graduate.

Note: if you choose to use Google Sites (which, again, I think is a good idea), you will need to modify a setting to make your site available to the public. Once you create your site, click on the “share” button (it should look like a silhouette of a person with a “+” next to them), then under “Published Site”, click on “Public”. See below for a short .gif demonstration.

Project Placeholder

Once you have created your ePortfolio and downloaded R and RStudio (instructions will be given in class), you should generate and upload your project placeholder. To do so, download the project template, render it in RStudio, and upload/embed the resulting HTML file. Sometimes, this last part can be a bit of a pain for students. Note that you will first need to view and copy the HTML file’s underlying code (see how to view underlying code, then press Ctrl+A to highlight all and Ctrl+C to copy). Then, you will need to paste this code into an “Embed” element in your ePortfolio (instructions are available for both Google Sites and Wix). I will be happy to help troubleshoot via email or Zoom on any of these steps, so please reach out if you are having issues!

To receive full credit for this checkpoint, students must submit a working URL that navigates to an ePortfolio home page with a professional picture (or a placeholder image), their name, major, hometown, graduation date, and a short bio. Students must also upload/embed a placeholder HTML file for their final project.

Checkpoint 2

Topic(s) | 20 Points

Before submitting this checkpoint, Checkpoint 1 must have been both submitted and graded.

To receive full credit, students must submit multiple research topics they’re interested in. Multiple (3-5) topics are required to keep options open in case one falls through. In most instances, students should be testing the effect of a particular law, policy change, or event on an outcome of their choosing. Therefore, students should identify a general topic, a related event, or both. Some examples of general topics might be: air quality, housing prices, sports, education, the wage gap, crime, life expectancy, traffic fatalities, etc. Some examples of events might be: ACA expansion, marijuana legalization, the rollout of same-sex marriage legalization, etc.

This might be a good opportunity to bounce some ideas off of AI. Tell AI that you need to write an econometrics paper using a causal inference method and that you’re interested in X, Y, and Z. You should also do a preliminary search for your topics on Google Scholar so you can see what other people have done, find some inspiration, etc. Be sure to include some of the most relevant papers in your submission. Note that your topic(s) need not be novel, as a replication/extension of another paper would be perfectly sufficient.

Some students may find it easier to start by looking at available data and then backing into a topic/event. Students who are interested in this strategy should explore the State Policy Database. This site contains detailed data on different state-level information by year including health, education, crime, and labor, and might be helpful for identifying different law/policy changes to study.

Checkpoint Example

I am broadly interested in the way changes in our society and culture interact and influence the cities in which we live. For example, I am interested in exploring the effects of same-sex marriage legalization on economic outcomes such as economic output, wages, housing prices, employment, etc.

Checkpoint 3

Research Question & Setting | 20 Points

Before submitting this checkpoint, Checkpoint 2 must have been both submitted and graded.

For this checkpoint, students are required to submit a specific research question with a clear, testable hypothesis. At this point, it is still OK if students have multiple research topics (and thus multiple questions). Gathering data is difficult, so having multiple possibilities will be for the better. Once again, looking to the academic literature for ideas/help is a good idea. This is also an instance where AI can be helpful in shaping and molding your ideas. After this checkpoint, though, AI’s usefulness will start to diminish.

Checkpoint Example

Narrowing down my previous topic, I would like to explore the effects of same-sex marriage legalization on the housing market. Same-sex marriage legalization could place upward pressure on housing prices due to a potential influx of same-sex couples to an area, a signal of progressivism to like-minded individuals, etc. To study this, I will explore the setting of the United States prior to 2015. Dates of same-sex marriage laws by state are available online (and importantly, not all states passed the law at the same time) while Zillow contains housing price data for different cities/states across the country.

Alternative Assignment: The alternative assignment for this checkpoint is to record a presentation of a published academic paper that uses difference-in-differences. Students should find their paper on Google Scholar, get it approved by the instructor, read it, and record themselves giving a 15-minute presentation. Students should format their presentation such that they cover the paper’s research question / motivation, data sources, empirical / identification strategy, assumptions, results, and a critical evaluation of the paper’s evidence / claims. A presentation template can be found here.

Checkpoint 4

Identification & Data Sources | 20 Points

Before submitting this checkpoint, Checkpoint 3 (not its alternative) must have been both submitted and graded.

Students must submit a clear, specific identification strategy and research design. This should feel similar to an imagination exercise, except a bit more grounded in reality. This is where students should tie their research question to an empirical identification strategy and propose data sources. This checkpoint, and all future checkpoints, should be submitted via the project template.

Checkpoint Example

To identify the effects of same-sex marriage legalization on housing prices, I will compare prices in states that passed the law to prices in states that did not (until 2015, when it became legal nationally) in a difference-in-differences framework. This would be followed by a discussion of assumptions, threats to a causal interpretation, etc., and then by potential data sources. For example, housing price data could come from Zillow’s ZHVI dataset (include link and further description), and legalization dates could come from a news article or website. Control variables could come from the census, etc.
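
As a rough sketch of how the comparison in this example might be written out, one common formulation is a two-way fixed effects difference-in-differences model. The notation below is an illustrative assumption rather than a required specification: $Y_{st}$ is the housing price outcome for state $s$ in period $t$, $D_{st}$ equals one once state $s$ has legalized same-sex marriage (and zero otherwise), $\alpha_s$ and $\gamma_t$ are state and time fixed effects, and $X_{st}$ is an optional vector of control variables.

```latex
Y_{st} = \alpha_s + \gamma_t + \beta D_{st} + X_{st}'\theta + \varepsilon_{st}
```

Under a parallel trends assumption, $\beta$ would be interpreted as the average effect of legalization on the outcome. Because states legalized at different times, this simple specification can be misleading if effects differ across states or over time, which is worth addressing in the discussion of assumptions.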

Alternative Assignment: The alternative assignment for this checkpoint is to record a presentation of a published academic paper that uses matching or weighting for causal inference. Students should find their paper on Google Scholar, get it approved by the instructor, read it, and record themselves giving a 15-minute presentation. The recording should contain the slides and a visual of the student (basically, record a Zoom meeting with yourself). Students should format their presentation such that they cover the paper’s research question / motivation, data sources, empirical / identification strategy, assumptions, results, and a critical evaluation of the paper’s evidence / claims. A presentation template can be found here.

Checkpoint 5

Data Collection | 20 Points

Before submitting this checkpoint, Checkpoint 4 (not its alternative) must have been both submitted and graded.

To receive full credit for this checkpoint, students must finalize where their data is coming from, discuss the collection process, and show evidence that data has been collected or is being collected. Include this information in the project template so there is less to write later on.

Alternative Assignment: Complete an imagination exercise for Module 4: Matching.

Checkpoint 6

Data Summary | 20 Points

Before submitting this checkpoint, Checkpoint 5 (not its alternative) must have been both submitted and graded.

Students must submit a summary of the data they’ll be using in their analysis (including, but not limited to, a summary statistics table and some figures) built on top of what they have already written in previous checkpoints. Of course, the data must be appropriate for the research question and causal inference strategy. For example, if a student is choosing to use Synthetic Control, they must have panel data with only one or two treated units. Students should anticipate their sample sizes being in the hundreds or thousands unless otherwise approved by the instructor.

Alternative Assignment: The alternative assignment for this checkpoint is to record a presentation of a published academic paper that uses a synthetic control methodology. Students should find their paper on Google Scholar, get it approved by the instructor, read it, and record themselves giving a 15-minute presentation. Students should format their presentation such that they cover the paper’s research question / motivation, data sources, empirical / identification strategy, assumptions, results, and a critical evaluation of the paper’s evidence / claims. A presentation template can be found here.

Checkpoint 7

Analysis | 20 Points

Before submitting this checkpoint, Checkpoint 6 (not its alternative) must have been both submitted and graded.

This checkpoint requires students to submit a draft of their analysis in the project template. This document should not only include an analysis, but also an outline of the rest of their final project. See below for more information.

Alternative Assignment: The alternative assignment for this checkpoint is to record a presentation of a published academic paper that uses an instrumental variable. Students should find their paper on Google Scholar, get it approved by the instructor, read it, and record themselves giving a 15-minute presentation. Students should format their presentation such that they cover the paper’s research question / motivation, data sources, empirical / identification strategy, assumptions, results, and a critical evaluation of the paper’s evidence / claims. A presentation template can be found here.

Final Project

Final Project | 51 Points

Final projects must be rendered .html files that are uploaded to an ePortfolio. The project’s outline should be as follows:

  • Introduction: Motivate and introduce your research question. Convince the reader that this topic/question is important and that they should care. What data do you use? What causal inference strategy do you use to address this question? What do you find? What can we learn from this?
  • Literature Review: What work has already been done on this topic by others? What are their conclusions? How is your work different?
  • Data: Where does your data come from? Why is this data good for answering your question? Be sure to create, and discuss, a summary statistics table and some plots.
  • Empirical Strategy: Outline your identification strategy including your causal inference method. Provide a discussion about why you are using the method you’re using. What are the strengths and weaknesses? What are your assumptions? Generate the results, and interpret your findings. You might want to split this into two sections: Empirical Strategy and Results.
  • Conclusion: Remind the reader why your topic is important, what your research question is, what you do, and what you find. Be sure to include a discussion of the implications of your findings.

You may use this final project template to help you get started. For more help on what your project should look like, etc., check out this sample paper by Dr. Tomas Dvorak. This is obviously a slimmed-down version of what you would be generating, but the idea is the same.

Project Rubric

Structure

0 - 10 Points

  • The paper is organized into sections with appropriate (sub-)headings and free from formatting issues.
  • Writing is free of spelling and grammar mistakes. The style of writing is professional and/or that of an academic article.
  • Overall, the paper adheres to norms in the literature. Tables and figures are labeled, referenced works are cited appropriately, etc.

Econometric Analysis

0 - 31 Points

  • The research question, and its importance, is prominent and clear to the reader.
  • The hypotheses of the author are clear and supported by sound economic theory and/or logic.
  • Data sources are provided and discussed.
  • The data used is appropriate for addressing the research question and chosen identification strategy.
  • Summary statistics are presented and discussed.
  • Visualizations are labeled, informative, and discussed.
  • Causal Inference Technique and Identification:
    • The mechanics of the chosen method are explained in the context of the study.
    • The author is up front about the assumptions required. Be sure to critically evaluate how reasonable these assumptions are.
    • Why is this method appropriate for this project?
  • If applicable, the econometric model should be written out with a clear connection to the research question.
  • Results are generated appropriately, and presented in either a table, figure, or both.
  • Both the magnitudes and signs of the results should be interpreted. What are the weaknesses of the analysis? Are there any alternative explanations for your findings?
  • Technical concepts are effectively and efficiently communicated throughout the project.

Code

0 - 10 Points

All code is present, concise, commented, and neatly folded. A reader could easily replicate the project using what is described in the text and the code supplied.