For this assignment, you will

Part 1: Design a data model

I want you to pick some business, social, scientific, or industrial phenomenon that is complex and for which some data is publicly available. To get ideas, I suggest that you browse the public data sets at:

Define a relational data model using the "tables and arrows" diagram style that I used in Lectures 11 and 12. You must:

You must choose a data domain for which you can somehow get real data.  Your database must eventually have at least 200 rows in total.  However, your data loading process should be automated, so you should feel free to tackle data sets with hundreds of thousands or millions of rows.  Do not choose a data domain identical to one of the examples we already modeled in class.  Please also read the Part 2 instructions so that you choose a database that will work with Part 2.
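
As a rough illustration of what an automated loading process and the 200-row check might look like, here is a minimal sketch. Everything in it is an assumption made for the example, not a requirement of the assignment: it uses Python's built-in sqlite3 module only so that it runs without a database server, the `station` and `reading` tables are hypothetical, and the generated CSV text is a placeholder standing in for the real data files you would download.

```python
import csv
import io
import sqlite3

# Hypothetical two-table schema: one "station" has many "reading" rows.
SCHEMA = """
CREATE TABLE station (
    station_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE reading (
    reading_id INTEGER PRIMARY KEY,
    station_id INTEGER NOT NULL REFERENCES station(station_id),
    value      REAL NOT NULL
);
"""


def load_csv(conn, table, columns, csv_file):
    """Insert every row of a CSV file (with a header line) into `table`."""
    reader = csv.DictReader(csv_file)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    conn.executemany(sql, ([row[c] for c in columns] for row in reader))
    conn.commit()


def total_rows(conn, tables):
    """Return the combined row count across the given tables."""
    return sum(
        conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in tables
    )


def build_demo():
    """Build an in-memory database from placeholder CSV text.

    In a real project the io.StringIO objects would be replaced by
    open("your_file.csv", newline="") on the real data files.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    stations = "station_id,name\n" + "\n".join(
        f"{i},S{i}" for i in range(1, 11)
    )
    readings = "reading_id,station_id,value\n" + "\n".join(
        f"{i},{(i % 10) + 1},{i * 0.5}" for i in range(1, 201)
    )
    load_csv(conn, "station", ["station_id", "name"], io.StringIO(stations))
    load_csv(
        conn,
        "reading",
        ["reading_id", "station_id", "value"],
        io.StringIO(readings),
    )
    return conn
```

Calling `total_rows(build_demo(), ["station", "reading"])` adds up the rows in both tables, which is one way to confirm that a database meets the minimum size. Because the load is a script rather than manual entry, the same code would work unchanged on a file with hundreds of thousands of rows.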

Your Part 1 submission should include

All of the above should be in one PDF file.

One of the teaching staff will review and return your answers to Part 1 before you can continue to Part 2.  Ideally, you would come to office hours so that we can review your Part 1 in person.  Please turn in Part 1 as soon as possible so that you can start Part 2 sooner.

Working with a partner

You have the option to work in a group of two, but this is not required or expected.  If you do work with a partner, then we will expect a more complex project:

If you do work with a partner, then only one submission is required. It should list both people and describe the contributions of each partner. The non-submitting partner should leave a comment in Canvas saying who they worked with.

Grading Rubric

Part 2 of the final project is worth 10 points. The basic criteria above will get you to 8 out of 10 points (assuming there are no errors). You can earn an extra point by doing something extra, like: