For this assignment, you will
- (Part 1)
- choose a data set,
- design a SQL data model for it,
- pose questions that can be answered using this data
- (Part 2)
- create the tables in SQLite,
- load data, and
- answer your questions.
Working with a partner
You have the option to work in a group of two, but this is not required or expected. If you do work with a partner, then we will expect a more complex project. Please see the Part 1 description for more details. The non-submitting partner should leave a comment in Canvas saying who they worked with.
Part 2: Database implementation
- Write SQL code to define your tables in SQLite. This will be a series of CREATE TABLE commands that can be run in the DB Browser for SQLite program. Please review Lecture 11 and the online SQLite documentation and tutorials. Please remember to:
- Define primary keys
- Define foreign keys (using the “references” keyword)
- Specify “NOT NULL” for columns that are required (most, if not all, columns)
- Add UNIQUE keys where necessary
- Set proper column types (integer, float, text, etc.)
- Load data into your tables in one of three ways:
- Download one or more CSV files and load them using the "import table" feature of DB Browser for SQLite. You may need to load the data into temporary tables and then use queries to fill the permanent tables. Record all the steps you followed to build the database.
- Write code in a general-purpose programming language (like R, Python, or Matlab) to read data from a source file, connect to your SQLite database, and insert the data. This option will be necessary if you are reading data from JSON or XML files, scraping data from a web page, getting data from a web API, or if you choose to randomly generate some data.
- You may find it necessary to manually write SQL code to insert data into some of your smaller tables. This will be a series of INSERT INTO commands. However, as I said before, the bulk of your data import should be done in an automated fashion, not by typing out hundreds of INSERTS.
You will turn in the resulting sqlite database file (.sqlite or .db) AND some combination of imported CSV files, SQL code, or code in another programming language (plus the source data files). Also, include a README.txt file explaining exactly what steps you followed to create the database. You should give enough details for the grader to reproduce the database from scratch, if necessary.
Answer your own questions that you wrote for Part 1. You will turn in both the SQL queries and the results.
Please include your updated Part 1 diagram in your final submission. I say "updated" because you may have to make some adjustments or corrections. Even if you do not make any changes, please include a copy of the diagram.
Combine all of the above in a zip file submission. NOTE: Please "test" you zip file submission by copying it to an empty folder, unzipping it, and trying to open the *.sqlite or *.db file in DB Browser for SQLite.