
KPMG - Data Quality Assessment

Task 1

Data Quality Assessment

Assessment of data quality and completeness in preparation for analysis


Here is the background information on your task


Sprocket Central Pty Ltd, a medium-size bikes & cycling accessories organization, has approached Tony Smith (Partner) in KPMG’s Lighthouse & Innovation Team. Sprocket Central Pty Ltd is keen to learn more about KPMG’s expertise in its Analytics, Information & Modelling team.

Smith discusses KPMG’s expertise in this space. In particular, he speaks about how the team can effectively analyze the datasets to help Sprocket Central Pty Ltd grow its business.

Primarily, Sprocket Central Pty Ltd needs help with its customer and transactions data. The organization has a large dataset relating to its customers, but its team is unsure how to effectively analyze it to help optimize its marketing strategy.


To support the analysis, you speak to the Associate Director for some ideas, and she advises that “the importance of optimizing the quality of customer datasets cannot be underestimated. The better the quality of the dataset, the better chance you will be able to use it to drive company growth.” The client provided KPMG with 3 datasets:

  • Customer Demographic

  • Customer Addresses

  • Transactions data for the past 3 months

You decide to start the preliminary data exploration and identify ways to improve the quality of Sprocket Central Pty Ltd’s data.



TASK

Draft an email to the client identifying the data quality issues and strategies to mitigate these issues. Refer to ‘Data Quality Framework Table’ and resources below for criteria and dimensions which you should consider.




SOLUTION

Dear [Client point-of-contact],

Thank you for providing us with the three datasets from Sprocket Central Pty Ltd. The table below highlights the summary statistics from the three datasets received.


Notable data quality issues that were encountered and the methods used to mitigate the identified data inconsistencies are as follows. Furthermore, recommendations have been provided to avoid the recurrence of data quality issues and improve the accuracy of the underlying data used to drive business decisions.

Ensure Data Quality in Customer Demographic Dataset

● In the Gender column, some values are misspelled or abbreviated.

Recommendation: Replace the misspelled values and the abbreviations F and M with Female and Male respectively.


● In the DOB column, there is an outlier: one person has an age of 175.

Recommendation: Remove this record from the dataset.


● In the Job Title column, many values are blank.

Recommendation: Remove all blank entries.


● The Deceased Indicator column has two values, Y and N.

Recommendation: Remove the Y entries so that subsequent results are accurate.


● The Default column contains entries that are improper and not valid.

Recommendation: Delete the entire column.
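The Gender and DOB checks above can also be scripted rather than fixed by hand. The following is only a minimal sketch in plain Python. The mapping entries (such as "femal"), the function names and the 110-year age cap are assumptions made for illustration, since the exact misspellings in the dataset are not listed here.

```python
from datetime import date

# Hypothetical mapping: the real misspellings should be read from the data.
GENDER_MAP = {
    "f": "Female",
    "femal": "Female",
    "female": "Female",
    "m": "Male",
    "male": "Male",
}


def normalize_gender(raw):
    """Return 'Female' or 'Male' for known variants, else the input unchanged."""
    key = raw.strip().lower()
    return GENDER_MAP.get(key, raw)


def age_on(dob, today):
    """Age in whole years on the given day."""
    years = today.year - dob.year
    # Subtract one year if the birthday has not happened yet this year.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years


def is_age_outlier(dob, today, max_age=110):
    """Flag ages that are negative or above an assumed plausible maximum."""
    age = age_on(dob, today)
    return age < 0 or age > max_age
```

With an illustrative date of birth of 1 January 1845 and a reference date of 1 January 2020, age_on gives 175 and is_age_outlier flags the record. Values the mapping does not recognise are returned unchanged so they can be reviewed manually.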


Ensure Data Quality in Customer Address Dataset

● In the State column, states are entered both as acronyms and as full names.

Recommendation: Replace all full state names with their respective acronyms.
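The state replacement can be expressed as a small lookup. This is a sketch only: it assumes the states are Australian, and the three entries in the mapping are illustrative, because the actual values in the State column are not listed above.

```python
# Assumed mapping of full names to acronyms; extend it to match the data.
STATE_ACRONYMS = {
    "new south wales": "NSW",
    "victoria": "VIC",
    "queensland": "QLD",
}


def to_state_acronym(raw):
    """Map a full state name to its acronym; tidy values that already look like acronyms."""
    cleaned = raw.strip()
    key = cleaned.lower()
    if key in STATE_ACRONYMS:
        return STATE_ACRONYMS[key]
    # Short values are treated as acronyms and upper-cased;
    # anything else is returned as-is for manual review.
    return cleaned.upper() if len(cleaned) <= 3 else cleaned
```

For example, to_state_acronym("New South Wales") returns "NSW", while an unrecognised full name is left untouched rather than guessed.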


Ensure Data Quality in Transactions Dataset

● Remove blanks in the ‘Online order’ and ‘Brand’ columns.


● In the list_price column, prices are stored in plain number format rather than as currency.

Recommendation: Change the numbers to Currency format.


● In the Product_first_sold_date column, the values are numbers that do not read as dates.

Recommendation: Change the number format to Short Date.
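Outside a spreadsheet, the same two conversions can be sketched in Python. This assumes the numbers in Product_first_sold_date are Excel serial dates in the default 1900 date system (suggested by the Short Date recommendation, but not confirmed by the data) and that prices are in dollars. The function names are hypothetical.

```python
from datetime import date, timedelta

# Assumed epoch for Excel's 1900 date system. It is valid for serials from
# 61 (1 March 1900) onward, because Excel counts a non-existent 29 Feb 1900.
EXCEL_EPOCH = date(1899, 12, 30)


def serial_to_date(serial):
    """Convert an Excel serial day number to a calendar date."""
    return EXCEL_EPOCH + timedelta(days=int(serial))


def format_currency(amount, symbol="$"):
    """Format a number with a currency symbol, thousands separators and two decimals."""
    return f"{symbol}{amount:,.2f}"
```

Under these assumptions, serial_to_date(43831) gives 1 January 2020 and format_currency(1234.5) gives "$1,234.50". If the workbook uses a different date system, the epoch has to change accordingly.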

Ensure Data Quality in NewCustomerList Dataset

● In the ‘past_3_years_bike_related_purchase’ and ‘Postcode’ columns, the data type is wrong: numbers are stored as text.

Recommendation: Select the entire column and convert it to a number.


● Remove blanks in the ‘DOB’ and ‘Job Title’ columns.


● In the ‘Property Valuation’ column, numbers are stored as text and show decimal places once converted.

Recommendation: Select the entire column, convert it to a number, then decrease the decimal places.


● Remove the unwanted columns that were generated by a random function.


● Name the column that holds the RANK() result ‘Rank’.
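The text-to-number fixes in this section can be sketched as a single helper. This is an illustration under assumptions: the function name is hypothetical, blank cells are returned as None, and whole numbers such as "9.0" are returned as integers, which mirrors the decrease-decimal step for Property Valuation.

```python
def text_to_number(raw):
    """Convert a text cell to int when it holds a whole number, otherwise to float.

    Blank cells return None. Non-numeric text raises ValueError so that
    bad entries surface instead of being silently dropped.
    """
    cleaned = raw.strip()
    if not cleaned:
        return None
    value = float(cleaned)
    return int(value) if value.is_integer() else value
```

For example, text_to_number("9.0") returns 9 and text_to_number(" 3000 ") returns 3000. One design caveat: converting postcodes to numbers drops any leading zeros, so this follows the recommendation above rather than being the only reasonable choice.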


©2022 by The Analyst.
