Application
Click here for a Google Doc template for this chapter’s application problems.
To include screenshots in your doc:
Tutorial for adding images/tables/drawings to a Google Doc
Windows (Snipping Tool for folks running older versions of Windows; Snip & Sketch for folks running updated versions)
We’ll also use a combined spreadsheet template for the data cleaning outputs.
Click here to make a copy of the Google Sheets template.
Question 1
Q1A: Provide three (3) distinct examples from the sample datasets that do not conform to tidy data principles. Include the example as well as an explanation of how this example does not conform to tidy data principles.
Q1B: Take a step back and consider the pattern errors you’re seeing in these datasets. What trends do you notice? Any thoughts or ideas as to how or why these pattern errors might have occurred?
Q1C: How would you address these pattern errors so the data conforms to tidy data principles? Explain what steps you would take to address at least 3 pattern errors. Each error explanation should include three parts:
An example of the issue
An explanation of your method to address the issue
The same example as tidy data
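As an illustration of the three-part format only, here is a minimal Python sketch. The rows, the column names (`height_weight`, `height_in`, `weight_lb`), and the `tidy_row` helper are hypothetical and are not taken from the sample datasets. The issue shown is one cell holding two variables, the method is to split that cell, and the result is the same rows as tidy data.

```python
# Part 1: an example of the issue (hypothetical rows, not from the sample data).
# One cell packs two variables, height and weight, into a single value.
untidy = [
    {"player": "A. Example", "height_weight": "6-2 / 195"},
    {"player": "B. Sample", "height_weight": "5-11 / 180"},
]


def tidy_row(row):
    """Part 2: the method. Split the combined cell so each variable has its own column."""
    height, weight = (part.strip() for part in row["height_weight"].split("/"))
    feet, inches = height.split("-")
    return {
        "player": row["player"],
        "height_in": int(feet) * 12 + int(inches),  # store height in one unit (inches)
        "weight_lb": int(weight),
    }


# Part 3: the same example as tidy data (one variable per column, one player per row).
tidy = [tidy_row(row) for row in untidy]
```

Your own explanations for Q1C do not need code; the same three parts can be shown with screenshots or small tables.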
Question 2
Q2A: Starting with the pattern errors you identified for Q1, how would you address those issues using OpenRefine?
Q2B: Provide an outline for your data processing workflow. A set of steps or tasks is a good place to start. You’re welcome to include screenshots or sketches/diagrams if that would be helpful.
Q2C: Work through cleaning the players and teams tables. Once you’re done, export the .csv files and add them to the Google Sheets template for this chapter.
Q2D: Compare your experience working in OpenRefine to other experiences you have had in a text editor or spreadsheet program. In what ways do you understand, perceive, or relate to the data differently through working in OpenRefine? Reflect on and describe your experience cleaning this data in OpenRefine.
Question 3
Q3A: Starting with the pattern errors you identified for Q1, how would you address those issues using Excel?
Q3B: Provide an outline for your data processing workflow. A set of steps or tasks is a good place to start. You’re welcome to include screenshots or sketches/diagrams if that would be helpful.
Q3C: Work through cleaning the players and teams tables. Once you’re done, export the .csv files and add them to the Google Sheets template for this chapter.
Q3D: Compare your experience working in a spreadsheet program to other experiences you have had in a text editor, spreadsheet program, or OpenRefine. In what ways do you understand, perceive, or relate to the data differently through working in a spreadsheet program? Reflect on and describe your experience cleaning this data in a spreadsheet program.
Question 4
Q4A: Thinking about some of the pattern errors you identified for Q1, how would you address those issues using a form or survey template?
Q4B: Provide an outline or framework for a survey form for the players table.
You DO NOT need to actually create or submit a survey form.
Describe what types of questions and pre-defined question or field options you could use to generate the data in this file more effectively.
You’re welcome to include screenshots or sketches/diagrams if that would be helpful.
Question 5
Q5A: Thinking about some of the pattern errors you identified for Q1, how would you address those issues using data validation in a spreadsheet program?
Q5B: Provide an outline or framework for a data validation template for the players table.
You DO NOT need to actually create or submit a template.
Describe what data validation options and pre-defined field options you could use.
You’re welcome to include screenshots or sketches/diagrams if that would be helpful.
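One way to think about a validation framework is as a rule attached to each column, much like a spreadsheet's dropdown list, number range, or date format settings. The Python sketch below is an illustration under stated assumptions: the column names (`position`, `jersey_number`, `birth_date`), the allowed values, and the `invalid_fields` helper are all hypothetical and may not match the actual players table.

```python
import re

# Hypothetical columns and rules; the real players table may use different fields.
RULES = {
    # Like a dropdown list: only a fixed set of values is accepted.
    "position": lambda v: v in {"G", "F", "C"},
    # Like a whole-number range rule: digits only, between 0 and 99.
    "jersey_number": lambda v: v.isdecimal() and 0 <= int(v) <= 99,
    # Like a date format rule: YYYY-MM-DD text pattern (the calendar date itself is not checked).
    "birth_date": lambda v: re.fullmatch(r"\d{4}-\d{2}-\d{2}", v) is not None,
}


def invalid_fields(row):
    """Return the names of fields whose values break their rule (missing fields count as invalid)."""
    return [name for name, rule in RULES.items() if not rule(row.get(name, ""))]


good_row = {"position": "G", "jersey_number": "23", "birth_date": "1998-03-14"}
bad_row = {"position": "Guard", "jersey_number": "23", "birth_date": "3/14/98"}

good_result = invalid_fields(good_row)
bad_result = invalid_fields(bad_row)
```

Here `bad_row` fails the `position` and `birth_date` rules, which mirrors the kind of inconsistent entry that a pre-defined field option would prevent at data entry time.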