Quantitative Methods for Economics and Finance

ECONS507-19T

Problem Set

Due: Thursday 5 December, 5pm.

Hand in your completed work to MSB2.11 (Susan’s office). You can paste any Excel workings as

required. Please show your answers to a maximum of three decimal places. Most of the marks will

be for interpretation and presentation of the results. Please ensure that ALL the work you hand in is

your own work. You may discuss how to answer the questions with friends, but each of you MUST write

up your own answers. Any work that is too similar will be reported to the Head of School.

You will be using 2 separate data sets for this problem set. These data are saved in the file

‘ECONS507Problem Set Data.xlsx’ available on Moodle, each on a separate worksheet (as labelled

Q1_income and Q2_lotteries).

1. [15 marks] A student surveyed 49 workers to obtain details on their annual income (from earnings),

whether they had a post-school qualification (1 if yes, 0 if no), and their age. These data are in the

worksheet called Q1_income.

(a) Using Excel, carry out a t-test to determine if incomes of workers with post-school qualifications

exceed those of workers without qualifications. Be careful to state your null and alternative

hypotheses and to justify the level of significance you choose (Hint: consider the scenarios

where Type I and Type II errors might occur). (9 marks)

(b) Discuss whether the evidence in (a) proves or disproves the claim that education has a positive

causal effect on worker productivity (which is picked up in the labour market by more productive

workers being paid more since economic theory tells us that workers are paid according to their

marginal productivity). (6 marks)
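
As a cross-check on the Excel output in 1(a), the test statistic can be reproduced by hand. The Python sketch below is illustrative only: the income figures are made up (they are not the Q1_income data), the function name welch_t is hypothetical, and it assumes the unequal-variance (Welch) form of the two-sample t-test, which may not be the variant you choose in Excel.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom).

    A positive t statistic supports H1: mean of sample_a > mean of sample_b.
    Assumes the unequal-variance (Welch) two-sample t-test.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 divisor)
    var_b = statistics.variance(sample_b)
    se_sq = var_a / n_a + var_b / n_b      # squared standard error of the difference
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_sq)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t_stat, df


# Made-up annual incomes ($000), for illustration only
qualified = [62.0, 58.0, 71.0, 66.0, 60.0]
unqualified = [48.0, 55.0, 51.0, 46.0, 50.0]

t_stat, df = welch_t(qualified, unqualified)
print(f"t = {t_stat:.3f}, df = {df:.3f}")
```

The t statistic is then compared with the one-tailed critical value at your chosen level of significance, or converted to a one-tailed p-value, exactly as with the Excel output.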

2. [50 marks] Lotteries (such as Lotto in NZ) have become important sources of revenue for

governments. Many people have criticised lotteries, however, referring to them as a tax on the poor

and uneducated. In an examination of the issue, a random sample of 100 adults was asked how much

they spend on lottery tickets and was interviewed about various socioeconomic variables. The

purpose of this study is to test the four following beliefs:

(i) Relatively uneducated people spend more on lotteries than do relatively educated people.

(ii) Older people buy more lottery tickets than younger people.

(iii) People with more children spend more on lotteries than people with fewer children.

(iv) Relatively poor people spend a greater proportion of their income on lotteries than relatively rich

people.

The data for this question are in the worksheet called Q2_Lotteries.

(a) Outline the key statistics for each of the variables, by using the descriptive statistics function in

Data Analysis in Excel. Briefly describe each variable by referring to 3 or 4 of the statistics

presented in the table. (Hint: Present the table showing all the statistics identified in the 1st

column and then paste in the relevant information for each variable). (10 marks)

(b) Use Excel to draw a separate scatter plot for each independent variable against the dependent

variable (i.e. amount spent on lotteries as a % of total income, LSpend). Include the linear trend

line for each graph. Using only the information from the graphs, which independent variable

appears to have the strongest linear relationship with the proportion of total income spent on

lotteries? Briefly discuss whether any of the beliefs (i) to (iv) are confirmed by the plots. (14 marks)

(c) Determine the correlation coefficient, using the correlation function in Excel, for each

independent variable against the amount spent on lotteries. Comment on each correlation and

also whether this correlation information confirms the four beliefs and your choice of the strongest

linear relationship in 2(b). (10 marks)

(d) (i) Develop a simple regression for each of the independent variables against the dependent

variable and summarise the results in the table below. Include the p-value underneath each

coefficient.

| Estimated Coefficients (p-value) | Model (1) | Model (2) | Model (3) | Model (4) |
| Intercept | | | | |
| Education | | | | |
| Age | | | | |
| # Child | | | | |
| Personal Income | | | | |
| R-Square | | | | |
| Significance-F | | | | |

(ii) Use this information to discuss which model is the best. Give an interpretation of the best

equation and one of the other three equations.

(iii) Discuss if the four beliefs are confirmed by their appropriate simple regression equation.

(10 marks)

(e) For the ‘best’ simple regression model in 2(d)(ii), analyse the residuals and comment on how well

they meet the assumptions of the Ordinary Least Squares (OLS) regression model.

(6 marks)
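
The quantities requested in 2(c) to 2(e) are linked: in a simple regression with an intercept, R-Square equals the square of the correlation coefficient, and each residual is the observed value minus the fitted value. The Python sketch below illustrates these links on made-up numbers (they are not the Q2_Lotteries data). The helper simple_ols and the variable names are hypothetical, and the sketch is a way to check your understanding of the Excel output rather than a substitute for it.

```python
import math


def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns a dict with the intercept, slope, correlation coefficient,
    R-squared and the list of residuals (observed minus fitted).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    r = s_xy / math.sqrt(s_xx * s_yy)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return {
        "intercept": intercept,
        "slope": slope,
        "r": r,
        "r_squared": r ** 2,  # holds for simple regression with an intercept
        "residuals": residuals,
    }


# Made-up data: years of education and lottery spend as a % of income
education = [8.0, 10.0, 12.0, 14.0, 16.0]
lspend = [9.0, 7.0, 6.0, 4.0, 3.0]

fit = simple_ols(education, lspend)
print(f"intercept = {fit['intercept']:.3f}, slope = {fit['slope']:.3f}")
print(f"r = {fit['r']:.3f}, R-squared = {fit['r_squared']:.3f}")
print("residuals:", [round(e, 3) for e in fit["residuals"]])
```

The residuals from a fit with an intercept always sum to zero, so that fact alone says nothing about the OLS assumptions. For 2(e), look instead at the pattern of the residuals, for example by plotting them against the fitted values and by inspecting their histogram.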