BUS708 Statistics and Data Analysis
Trimester 3, 2019
1 OVERVIEW OF THE ASSIGNMENT
This assignment will test your ability to collect, summarise and present data using Microsoft Excel and/or
other approved tools. It will also test your ability to interpret the output produced by the
software in order to solve business problems.
You will need to use the dataset provided, collect your own dataset, and produce
numerical and graphical summaries of both. You will need to submit an Excel file that follows the
requirements given in the Submission Requirement section.
2 TASK DESCRIPTION
There are two datasets involved in this assignment: Dataset 1 and Dataset 2, detailed below.
Dataset 1: You will receive an email that contains a dataset that is specifically allocated to you.
This dataset is edited from the original dataset provided by “Inside AirBNB”, compiled on
14 September 2019. The original dataset can be obtained from http://insideairbnb.com/get-the-data.html under the Creative Commons CC0 1.0 Universal (CC0 1.0) “Public Domain
Dedication” license. To view a copy of this license, visit
https://creativecommons.org/publicdomain/zero/1.0/. Some of the variables in the original
dataset have been removed and the number of cases has been reduced.
Dataset 2: You will need to collect a dataset via survey to answer the question given in
Section 6 below. You will need to collect data from at least 30 international students.
Both datasets should be saved in an Excel file (see Submission Requirement on the next page). All data
processing should be performed in Excel or StatKey (http://www.lock5stat.com/StatKey). Specific
instructions on which tool to use for each section will be given during tutorials.
Your tasks are to provide a description of each dataset in Section 1, and to answer the
research questions given in Section 2 to Section 6 using Dataset 1 or Dataset 2 as indicated in each
section.
1. Section 1: Description of the Data
a. Dataset 1: Give a short but clear description of this dataset. Is this primary or
secondary data? What are the cases? How many variables are there in the dataset?
b. Dataset 2: Explain how you collected the data and discuss its limitations (e.g. whether
your sample is biased). Is this primary or secondary data? What are the variables and
2. Section 2: What are the proportions of different room types of AirBNB in Sydney?
Using Dataset 1, describe the proportions of the different room types available for rent
on AirBNB in Sydney. You need to provide the frequency and the proportion (either as a
decimal or a percentage) of each room type, as well as a graphical display that clearly
shows these proportions.
3. Section 3: What is the AirBNB price distribution of private rooms after an iteration of
outlier detection?
Using Dataset 1, perform one iteration of outlier detection on the price of private
rooms using the method described in the lecture notes. After removing those
outliers, describe the price distribution of private rooms using both a numerical
summary and a graphical summary that shows the remaining outliers, if any.
4. Section 4: Is there a difference in the number of available days in the next 365 days among
different room types?
Using Dataset 1, describe the distribution of availability over the next 365 days
for each room type. You need to provide both a numerical summary and a
graphical display that shows the outliers, if any.
5. Section 5: Is there any relationship between Longitude and Price?
Using Dataset 1, describe the relationship between the longitude of an AirBNB
property's location and its price. You need to provide both a numerical summary and
a graphical display.
6. Section 6: Is there any relationship between gender and accommodation room type?
Using Dataset 2, describe the relationship between the gender of an international
student and the room type of their current accommodation, e.g. whether the student
currently shares a room, lives in a private room (but shares an apartment or a
house) or lives alone in an apartment or a house. You need to provide both
a numerical summary and a graphical display.
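As a cross-check for Section 2 only (the submitted work must still be done in Excel or StatKey), the frequency and proportion table can be sketched in Python. The room type labels and counts below are made up for illustration and are not taken from the allocated dataset.

```python
from collections import Counter

# Made-up sample of room types (illustrative only, not the assignment data).
room_types = [
    "Entire home/apt", "Private room", "Entire home/apt", "Shared room",
    "Private room", "Entire home/apt", "Entire home/apt", "Private room",
]

# Frequency of each category.
counts = Counter(room_types)
total = sum(counts.values())

# Proportion = frequency / total number of cases.
proportions = {room: n / total for room, n in counts.items()}

# Print a small frequency table, most common category first.
for room, n in counts.most_common():
    print(f"{room:<16} {n:>3} {proportions[room]:.1%}")
```

The proportions should always sum to 1 (or 100%), which is a quick way to check a table built in Excel.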
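The outlier method for Section 3 is the one given in the lecture notes, which are not reproduced here. The sketch below assumes the common 1.5 × IQR fence rule with inclusive quartiles (the convention used by Excel's QUARTILE.INC); if the lecture notes specify a different rule, that rule takes precedence. The prices are made up, and the function name is hypothetical.

```python
from statistics import quantiles

def remove_outliers_once(prices, k=1.5):
    """Apply ONE pass of the k*IQR fence rule (an assumed method).

    Returns the values inside the fences and the fences themselves.
    """
    # Inclusive quartiles, matching Excel's QUARTILE.INC convention.
    q1, _, q3 = quantiles(prices, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    kept = [p for p in prices if lower <= p <= upper]
    return kept, (lower, upper)

# Made-up private-room prices (illustrative only, not the assignment data).
sample_prices = [40, 45, 50, 55, 60, 65, 70, 75, 80, 500]
kept, (lower, upper) = remove_outliers_once(sample_prices)
print("fences:", lower, upper)
print("kept:", kept)
```

Because only one iteration is performed, the quartiles of the remaining data will differ from the original ones, so a box plot of the cleaned prices may still flag new outliers. That is why the task asks for a display that shows the remaining outliers, if any.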
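For Section 5, one possible numerical summary of the relationship between two quantitative variables is the Pearson correlation coefficient; this choice is an assumption, since the brief does not name a specific statistic. The sketch below computes it from first principles on made-up longitude and price values, and the result should agree with Excel's CORREL function on the same data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up values (illustrative only, not the assignment data).
longitudes = [151.10, 151.15, 151.20, 151.25, 151.28]
prices = [80, 95, 120, 150, 210]

r = pearson_r(longitudes, prices)
print(f"r = {r:.3f}")
```

A value of r near 0 indicates little linear relationship, while values near +1 or -1 indicate a strong one. A scatter plot is the natural graphical companion to this summary.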
3 SUBMISSION REQUIREMENT
Deadline to submit the report: Week 7, Sunday 22 Dec 2019, 23:59
You need to submit an Excel file to Turnitin with the following requirements:
1. Your Excel file should have 8 worksheets with the following names and order: Section 1,
Section 2, …, Section 6, Dataset 1, Dataset 2.
2. In the first worksheet (Section 1), you should write your student number in the top-left
corner, either in cell A1 or in a text box.
3. Failure to follow these requirements will result in a mark deduction.
4 MARKING CRITERIA
Students are advised to read the marking rubric provided on Moodle. Detailed marking criteria
based on this rubric will be provided during the Week 6 tutorial.
5 DEDUCTION, LATE SUBMISSION AND EXTENSION
Late submission penalty: 5% of the total available marks is deducted per calendar day unless an
extension is approved. This means 0.75 marks (out of 15 marks) per day.
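The arithmetic behind the daily figure can be sketched as follows. The function name is hypothetical, and the sketch assumes lateness is counted in whole calendar days; how part-days are treated is not stated here.

```python
TOTAL_MARKS = 15               # total available marks for the assignment
PENALTY_PERCENT_PER_DAY = 5    # late penalty, percent of total marks per day

def late_penalty(days_late):
    """Marks deducted for a submission that is `days_late` whole days late."""
    return TOTAL_MARKS * PENALTY_PERCENT_PER_DAY * days_late / 100

print(late_penalty(1))  # one day late
print(late_penalty(3))  # three days late
```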
For the extension application procedure, please refer to Section 3.3 of the Subject Outline. Please do
NOT email the lecturer or tutor to seek an extension; you need to follow the procedure described in
the Subject Outline.
Please read Section 3.4, Plagiarism and Referencing, of the Subject Outline, part of which is quoted below:
“Students plagiarising run the risk of severe penalties ranging from a reduction through to 0 marks for a first
offence for a single assessment task, to exclusion from KOI in the most serious repeat cases. Exclusion has
serious visa implications.”
“Authorship is also an issue under Plagiarism – KOI expects students to submit their own original work in both
assessment and exams, or the original work of their group in the case of a group project. All students agree to a
statement of authorship when submitting assessments online via Moodle, stating that the work submitted is
their own original work.
The following are examples of academic misconduct and can attract severe penalties:
• Handing in work created by someone else (without acknowledgement), whether copied from another
student, written by someone else, or from any published or electronic source, is fraud, and falls under
the general Plagiarism guidelines.
• Students who willingly allow another student to copy their work in any assessment may be considered
to assisting in copying/cheating, and similar penalties may be applied.”