Homework 2

Instructions

Provide answers to the the following questions. If the question asks for an SQL query, give the query and the resulting table. If the table has more than 6 rows, just show the first 6. If you cannot figure out how to write a particular SQL query, you can get partial credit by writing a query that does part of the requested task. (In this case, clearly state that you are submitting a partial solution.) Per the norm, research on the internet is fine. But if you directly use information from the internet, properly cite your source.

You may submit your answers to the questions below as a single document or several documents. If there are several, be sure they are obviously named. All documents must be either in text of pdf format.

Part 1: Queries in the UNIV database

  1. Find the number of people who have taken each course in the comp sci department. Show only the course name and count. Sort this list by course name.
    1. Write this query without using the SQL JOIN clause
    2. Write this query using the SQL join clause
    3. Describe the advantages / disadvantages of each of your queries. (Approximately one paragraph)
  2. What section of what course had the most students in each semester? (Show all 20 rows in your answer)
  3. What course had the most total students across all sections in all years? Do this without using either "order" or "limit". This query should return exactly one record.
  4. GPA at this university is defined to be function of the graded earned and the credits for each course. For example suppose, the grades earned in two classes are A and B, and those two classes are worth 2 and 5 credits respectively. Further suppose that A has a value of 4.0 and B has a value of 3.0. Then the GPA calculation is (4.0*2 + 3.0*5)/(2+5)=(8+15)/7=3.29. Show the name, student id and GPA of the three students that have the lowest GPA. (There is a grade_points table that gives letter grade to numeric mapping)
  5. Building from the previous query, in a single table, show the students with the three lowest, and 3 highest, GPAs. Order the resulting table by student name.
  6. Find the IDs of students who did not take a class before 2005 (but who have taken a course after 2005). Sort this list by ID and show only the top 5.
  7. Find the name of the instructor and salary of the highest paid instructor in all departments such that the department name starts with G
  8. Find the name of the the instructor, and that person's salary and department, such that they have the highest salary in their department, but a lower salary than that of the highest paid person in any other department. (That is, find the lowest salary from among the highest salaries.) This query should return only one record.
  9. Extra Credit: In the table definitions is buried a hint about what University this DB is based upon. Find the hint and name the university.

Part 2: Queries in the Rocket database

The questions below assume you are using the rocket3 database. (The rocket database does not have the launch_nonull table.) Both of the questions about this database require written answers, not just SQL queries.
  1. The following two queries both find all launches such that the launch apogee was higher than the launch apogee of any launch in 1957. The first is 15x slower than second!!! Explain.
                select apogee, date from launch where apogee > all ( select apogee from launch where date_part('year', date)=1957);
                select apogee, date from launch where apogee > ( select max(apogee) from launch where date_part('year', date)=1957);
            
  2. Consider a query in which you want to join the launch table to the launch_nonull table on the criteria launch_nonull.mission=launch.mission and launch_nonull.tag=launch.tag. Try every variation of join except the "cross join" (inner join, natural join, right outer join, ....). If I detect that you have use a cross join, you will receive a 0 on this assignment. Report the number of rows in the resulting table for each different join. Explain the differences between the sizes of the tables. (You may need to do some other queries to give good explanations.) Also, describe -- in detail -- why anyone using a cross join on this part of this assignment will receive a 0.

Part 3: Queries in the Sakila database

  1. Find the actor first name, last name and number of movies made, for the actor appearing in the most movies (one row only).
  2. Find all movies in which an actor with the last name GUINESS appeared with an actor last name DAVIS
  3. Find the maximum number of times two actor appeared together in a movie. (For instance, in actual movies, Tom Hanks and Meg Ryan were in 4 movies (assuming you count Ithaca)). Return a single row which gives this number and the first and last names of the two actors.
  4. Find the ids of three actors that appear together in a film 3 or more times
  5. Find the names of all films in which actors with ids: 22, 102 and 159 all appear.
  6. Find the first name and last name of all customers who rented the movie 'ACE GOLDFINGER'

Part 4

Discuss: null values in a database require ternary logic; which is a pain. So why should you you allow them (give at least two reasons)? What are the alternatives (describe at least one)? Is your alternative to nulls better or worse, why?

How to submit

Use the process for submission from HW1 except put your document(s) in a directory named, for example, hw2. The execute
    /home/gtowell/bin/submit -c 383 -d hw2 -p 2 
There may be other versions of submit on the servers. Be sure to use mine. The arguments to submit have the following meanings: You may submit as many times as you like.