Part II
In this part, explain how you get your numbers by indicating the algorithm and formula used. Simply providing a number will not give you credit, even if the number is correct. Example: cost of joining R and S by hash join is 3(M+N) = 3(500 + 1500) = 6000, where M=500 is the size of R and N=1,500 is the size of S.
Assume a database with relations:
ACTOR(last-name, first-name, dob, sex, birthplace)
MOVIE(title, year, director-name, studio-name, budget, duration, ticket-sales)
APPEARS(actor-name, title, salary, role)
where last-name is the primary key of ACTOR, title is the primary key of MOVIE, and (actor-name, title) is the primary key of APPEARS (and both actor-name and title are foreign keys in APPEARS to ACTOR and MOVIE respectively). Note that this implies that APPEARS reflects a many-to-many relationship between actors and movies. director-name and studio-name are foreign keys to tables DIRECTOR and STUDIO, not shown. last-name and first-name are 32 bytes each, as are director-name and studio-name. birthplace is 16 bytes, title is 64 bytes, dob is 6 bytes, sex is 2, role is 10, and year is 2 bytes. budget, duration, ticket-sales, and salary are 4 bytes each. The table ACTOR has 50,000 rows; MOVIE has 100,000 rows, and APPEARS has 1,200,000 rows.
Assume the following:
- There are 500 pages of memory available.
- Each page (memory and disk) is 1K, which can be simplified to 1,000 bytes.
- A pointer is 10 bytes.
- The order quantity is greater than 10 in 5% of all orders.
- The country is Japan in 15% of products.
- The discount is less than 0.05 for 8% of all orders.
(a) (4 points) Calculate the blocking factor and the number of blocks each file uses.
(b) (4 points) Draw the initial (non-optimized) query tree for this query.
SELECT title, sum(salary), sum(budget) FROM ACTOR A, MOVIE M, APPEARS AP WHERE A.last-name = AP.actor-name and M.title = AP.title AND A.birthplace like Kentucky AND M.director-name = Spielberg GROUP BY title;