Ethics and Privacy in Data Analytics

BUS5PB Semester 2 2022
BUS5PB Principles of Business Analytics
Semester 2 2022
Assignment 3: Ethics and Privacy in Data Analytics
Due: 3rd November 2022 Thursday @ 11:59pm
Release Date: 10th October 2022
Due Date: 3rd November Thursday @ 11:59pm
Assignment Type: Individual
Weight: 40%
Submission Format: A written report (electronic form) on the LMS site.
The third assignment focuses on the important topic of ethics and privacy of data analytics. There are
two tasks that you are required to complete and provide answers to the questions outlined below. You
should use lecture material as well as validated, external literature to support your responses.
Task 1 (18 marks)
Read and analyse the following case study to provide answers to the given questions.
Chelsea is a lead consultant in a top-level consulting firm that provides consulting services including
setting up secure corporate networks, designing database management systems, and implementing
security hardening strategies. She has provided award-winning solutions to several corporate customers
in Australia.
In a recent project, Chelsea worked on an enterprise level operations and database management
solution for a medium scale retail company. Chelsea has directly communicated with the Chief
Technology Officer (CTO) and the IT Manager to understand the existing systems and provide progress
updates of the system design. Chelsea determined that the stored data is extremely sensitive and
requires extra protection. Sensitive information such as employee salaries, annual performance
evaluations, and customer information including credit card details is stored in the database. She also
uncovered several security vulnerabilities in the existing systems. Drawing on both findings, she
proposed an advanced IT security solution, which was also expensive due to several new features.
However, citing cost, the client chose a less secure solution. This low level of security means employees
and external stakeholders alike may breach security protocols to gain access to sensitive data. It also
increases the risk of external threats from online hackers. Chelsea strongly advised that the system
should have the highest level of security. She has explained the risks of having low security, but the CTO
and IT Manager have been vocal that the selected solution is secure enough and will not lead to any
breaches, hacks or leaks.
a) Discuss and review how the decision taken by the CTO and IT Manager impacted the data privacy and
ethical considerations specified in the Australian Privacy Act and the ACS Code of Professional Conduct and
Ethics.
(7 marks)
ACS Code of Professional Conduct: https://www.acs.org.au/content/dam/acs/rules-and-regulations/Code-of-Professional-Conduct_v2.1.pdf
ACS Code of Ethics: https://www.acs.org.au/content/dam/acs/acs-documents/Code-of-Ethics.pdf
Australian Privacy Act: https://www.oaic.gov.au/privacy/the-privacy-act
b) Should Chelsea agree or refuse to implement the proposed solution? Provide your recommendations
and suggestions with appropriate references to handle the conflict.
(7 marks)
c) Suppose you are a member of Chelsea’s IT security team. She has asked you to perform a k-anonymity
evaluation for the dataset below. The quasi-identifiers are {Sex, Age, Postcode} and the sensitive
attribute is Income.
In the context of k-anonymity: Is this data 1-anonymous? Is it 2-anonymous? Is it 3-anonymous?
Is it 4-anonymous? Is it 5-anonymous? Is it 6-anonymous? Explain your answer.
(4 marks)

ID  Age    Postcode  Sex     Income ($)
1   20-25  308*      male    95000
2   20-25  318*      male    80000
3   40-45  318*      male    100000
4   30-35  308*      female  90000
5   20-25  308*      male    80000
6   30-35  308*      female  80000
7   30-35  318*      female  950000
8   40-45  308*      male    105000
9   40-45  318*      male    75000
10  20-25  308*      male    90000
11  30-35  308*      female  92000
12  40-45  308*      male    100000
13  20-25  318*      male    80000
14  40-45  308*      male    87000
15  30-35  318*      female  87000
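
As a reminder of how a k-anonymity evaluation is usually carried out, the sketch below groups records by their quasi-identifier values and reports the size of the smallest group. A table is k-anonymous when every such group (equivalence class) contains at least k records, so it is also k'-anonymous for every k' below that value. The function names and the `toy_records` data are hypothetical and deliberately differ from the assignment dataset; this is an illustrative sketch, not a prescribed method.

```python
from collections import Counter

# Quasi-identifier columns, taken from the task description.
QUASI_IDENTIFIERS = ("Sex", "Age", "Postcode")


def equivalence_class_sizes(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def max_k(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the largest k for which the records are k-anonymous (0 if empty)."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def is_k_anonymous(records, k, quasi_identifiers=QUASI_IDENTIFIERS):
    """True when every equivalence class holds at least k records."""
    return max_k(records, quasi_identifiers) >= k


# Hypothetical toy data (assumption for illustration; NOT the assignment table).
toy_records = [
    {"Age": "20-25", "Postcode": "300*", "Sex": "female", "Income": 70000},
    {"Age": "20-25", "Postcode": "300*", "Sex": "female", "Income": 65000},
    {"Age": "20-25", "Postcode": "300*", "Sex": "female", "Income": 72000},
    {"Age": "50-55", "Postcode": "301*", "Sex": "male", "Income": 88000},
    {"Age": "50-55", "Postcode": "301*", "Sex": "male", "Income": 91000},
]

print(max_k(toy_records))              # 2: the smallest class has two records
print(is_k_anonymous(toy_records, 3))  # False: one class has only two records
```

The sensitive attribute (Income) is deliberately ignored when forming the groups; k-anonymity only constrains the quasi-identifiers.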

Task 2 (22 marks)
Read and analyse the following case study to provide answers to the questions outlined below.
Josh and Hannah, a married couple in their 40s, are applying for a business loan to help them realise
their long-held dream of owning and operating their own fashion boutique. Hannah is a highly promising
graduate of a prestigious fashion school, and Josh is an accomplished accountant. They share a strong
entrepreneurial desire to be ‘their own bosses’ and to bring something new and wonderful to their local
fashion scene. The outside consultants have reviewed their business plan and assured them that they
have a very promising and creative fashion concept and the skills needed to implement it successfully. The
consultants tell them they should have no problem getting a loan to get the business off the ground.
For evaluating loan applications, Josh and Hannah’s local bank loan officer relies on an off-the-shelf
software package that synthesises a wide range of data profiles purchased from hundreds of private data
brokers. As a result, it has access to information about Josh and Hannah’s lives that goes well beyond
what they were asked to disclose on their loan application. Some of this information is clearly relevant to
the application, such as their on-time bill payment history. But a lot of the data used by the system’s
algorithms is of the kind that no human loan officers would normally think to look at, or have access to —
including inferences from their drugstore purchases about their likely medical histories, information from
online genetic registries about health risk factors in their extended families, data about the books they
read and the movies they watch, and inferences about their racial background. Much of the information
is accurate, but some of it is not.
A few days after they apply, Josh and Hannah get a call from the loan officer saying their loan was not
approved. When they ask why, they are told simply that the loan system rated them as ‘moderate-to-high
risk.’ When they ask for more information, the loan officer says he does not have any, and that the
software company that built their loan system will not reveal any specifics about the proprietary
algorithm or the data sources it draws from, or whether that data was even validated. In fact, they are
told, not even the developers of the system know how the data led it to reach any particular result; all
they can say is that statistically speaking, the system is ‘generally’ reliable. Josh and Hannah ask if they
can appeal the decision, but they are told that there is no means of appeal, since the system will simply
process their application again using the same algorithm and data, and will reach the same result.
Provide answers to the following questions based on what we have studied in the lectures. You may also
need to conduct research on literature to explain and support your points.
a) What sort of ethically significant benefits could come from banks using a big-data driven system to
evaluate loan applications?
(3 marks)
b) What ethically significant harms might Josh and Hannah have suffered as a result of their loan denial?
Discuss at least three possible ethically significant harms that you think are most important to their
significant life interests.
(8 marks)
c) Beyond the impacts on Josh and Hannah’s lives, what broader harms to society could result from the
widespread use of this loan evaluation process?
(3 marks)
d) Describe three measures or best practices that you think are most important and/or effective to
lessen or prevent those harms. Provide justification of your choices and the potential challenges of
implementing these measures.
(8 marks)
Hint: Your suggestion should align with the harms that you have discussed in the previous sections (Tasks
2b and 2c). You may review the lecture slides and select the relevant knowledge points. You may also
need to perform research on literature to explain and support your points.
Report Guidelines
1. The report should consist of a ‘table of contents’, an ‘introduction’, logically organised sections or
topics, a ‘conclusion’ and a ‘list of references’.
2. You may choose a fitting sequence of sections for the body of the report. Two main sections for the
two tasks are essential, and the subsections will be based on each of the questions given for each
task (label them accordingly).
3. Your answers should be presented in the order given in the assignment specifications.
4. The report should be written in Microsoft Word (font size 11) and submitted as a Word or PDF file.
5. You should use either APA or Harvard reference style and be consistent with the reference style
throughout your report.
6. You should also ensure that you have used paraphrasing and in-text citations correctly.
7. Word limit: 2500-3000 words (should not exceed 3000 words).
8. The final submission will comprise only one file, which is the written report as a Word or PDF file.
Do not compress this file into a zipped archive. Name the file:
<Student_ID>_Assignment3_Report.doc OR
<Student_ID>_Assignment3_Report.pdf
Marking Rubric
A grade will be awarded to each of the tasks and then an overall mark determined for the entire
assessment. The rubric below gives you an idea of what you must achieve to earn a certain ‘grade’.
As a general rule, to meet a ‘C’, you must first satisfy the requirements of a ‘D’. And for an ‘A’, you must
first satisfy the requirements of a ‘B’, which must of course first meet the requirements of a ‘C’ and so
on.
Given the nature of the two tasks, we have a single marking rubric as follows.

Grade Criterion
Not Pass (N)
0% – 49%
The most basic answer is provided but with no arguments or evidence of
understanding the different aspects of data ethics and privacy.
Pass (D)
50% – 59%
The basic answer is provided but the arguments are unclear, incomplete or
show a lack of thought. The answers clearly draw materials and insights
from a limited source or are put together just enough to present an
argument. The answers appear to respond from the “common sense”
of an individual rather than taking time to understand the deeper aspects
of data ethics and privacy.
Credit (C)
60% – 69%
The answer presented is given some thought but draws its discussion
points from inadequate or limited sources. The arguments presented are
logical and articulated. While the argument is acceptable, it could be
stronger or more considered. The answers should demonstrate a step up
from a “common sense” response.
Distinction (B)
70% – 79%
There is good depth to the discussion and the arguments presented go
beyond what is covered in the lecture material. They are logical, articulated and
well-backed by research into data ethics and privacy. There is evidence that
the answer is well thought out and there is a demonstrated understanding of
data ethics and concerns in the relevant local context and the context of the
problem.
High Distinction (A)
80% – 100%
Pretty much faultless, achieving what a B grade has done but taken to a level
where the arguments presented are highly thought-out, well-connected
with one another, and clearly demonstrate an understanding of the case
presented. If required, relevant assumptions are made and these assumptions
are logical and evidence driven. Additional research undertaken to support
arguments of data ethics and privacy is also further supported by literature.

Other Important Information
The university's standard plagiarism and collusion policy and its extension and special
consideration policy apply to this assignment.
A cover sheet is NOT required. By submitting your work online, the declaration on the
university’s assignment cover sheet is implied and agreed to by you.