Application

Click here for a Google Doc template for this chapter’s application problems.

To include screenshots in your doc:

We’ll also use a combined spreadsheet template for the data cleaning outputs.

Click here to make a copy of the Google Sheets template.

Question 1

Q1A: Provide three (3) distinct examples from the sample datasets that do not conform to tidy data principles. Include the example as well as an explanation of how this example does not conform to tidy data principles.

Q1B: Take a step back and consider the pattern errors you’re seeing in these datasets. What trends do you notice? Any thoughts or ideas as to how or why these pattern errors might have occurred?

Q1C: How would you address these pattern errors so the data conforms to tidy data principles? Explain what steps you would take to address at least 3 pattern errors. Each error explanation should include three parts:

  • An example of the issue

  • An explanation of your method to address the issue

  • The same example as tidy data
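As an illustration of the three-part format only, here is a minimal Python sketch. The rows, field names, and values are hypothetical and are not taken from the sample datasets; it assumes a cell that packs several values into one field, which conflicts with the tidy data principle that each value has its own cell.

```python
# Part 1: an example of the issue.
# Hypothetical rows (invented for illustration, not from the sample datasets):
# the "positions" field holds more than one value in a single cell.
untidy = [
    {"player": "A. Example", "positions": "P/1B"},
    {"player": "B. Sample", "positions": "C"},
]


# Part 2: the method. Split the multi-value cell so that each
# (player, position) pair becomes its own row.
def split_positions(rows, sep="/"):
    """Return one row per (player, position) pair."""
    tidy_rows = []
    for row in rows:
        for position in row["positions"].split(sep):
            tidy_rows.append(
                {"player": row["player"], "position": position.strip()}
            )
    return tidy_rows


# Part 3: the same example as tidy data.
tidy = split_positions(untidy)
for row in tidy:
    print(row)
```

Your own answers do not need to include code; the same three parts can be shown with a before-and-after table and a short written explanation.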

Question 2

Q2A: Starting with the pattern errors you identified for Q1, how would you address those issues using OpenRefine?

Q2B: Provide an outline for your data processing workflow. A set of steps or tasks is a good place to start. You’re welcome to include screenshots or sketches/diagrams if that would be helpful.

Q2C: Work through cleaning the players and teams tables. Once you’re done, export the .csv files and add them to the Google Sheets template for this chapter.

Q2D: Compare your experience working in OpenRefine to other experiences you have had in a text editor or spreadsheet program. In what ways do you understand, perceive, or relate to the data differently through working in OpenRefine? Reflect on and describe your experience cleaning this data in OpenRefine.

Question 3

Q3A: Starting with the pattern errors you identified for Q1, how would you address those issues using Excel?

Q3B: Provide an outline for your data processing workflow. A set of steps or tasks is a good place to start. You’re welcome to include screenshots or sketches/diagrams if that would be helpful.

Q3C: Work through cleaning the players and teams tables. Once you’re done, export the .csv files and add them to the Google Sheets template for this chapter.

Q3D: Compare your experience working in a spreadsheet program to other experiences you have had in a text editor, another spreadsheet program, or OpenRefine. In what ways do you understand, perceive, or relate to the data differently through working in a spreadsheet program? Reflect on and describe your experience cleaning this data in a spreadsheet program.

Question 4

Q4A: Thinking about some of the pattern errors you identified for Q1, how would you address those issues using a form or survey template?

Q4B: Provide an outline or framework for a survey form for the players table.

  • You DO NOT need to actually create or submit a survey form.

  • Describe what types of questions and pre-defined question or field options you could use to generate the data in this file more effectively.

  • You’re welcome to include screenshots or sketches/diagrams if that would be helpful.

Question 5

Q5A: Thinking about some of the pattern errors you identified for Q1, how would you address those issues using data validation in a spreadsheet program?

Q5B: Provide an outline or framework for a data validation template for the players table.

  • You DO NOT need to actually create or submit a template.

  • Describe what data validation options and pre-defined field options you could use.

  • You’re welcome to include screenshots or sketches/diagrams if that would be helpful.
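To make the idea of validation rules concrete, here is a minimal Python sketch of the kinds of checks a spreadsheet's data validation settings express: a list of allowed values (like a dropdown), a numeric range, and a date format. The field names, allowed values, and ranges are hypothetical assumptions and are not taken from the players table.

```python
from datetime import datetime

# Hypothetical allowed values, standing in for a dropdown list.
ALLOWED_POSITIONS = {"P", "C", "1B", "2B", "3B", "SS", "OF"}


def check_position(value):
    """List rule: the value must be one of the allowed options."""
    return value in ALLOWED_POSITIONS


def check_year(value, low=1800, high=2100):
    """Range rule: the value must be a whole number within [low, high]."""
    try:
        return low <= int(value) <= high
    except ValueError:
        return False


def check_date(value):
    """Format rule: the value must parse as a year-month-day date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


# Hypothetical field names mapped to the rule each one must satisfy.
RULES = {
    "position": check_position,
    "birth_year": check_year,
    "debut_date": check_date,
}


def validate(row):
    """Return the names of the fields in this row that fail their rule."""
    return [field for field, rule in RULES.items() if not rule(row.get(field, ""))]


good = {"position": "P", "birth_year": "1919", "debut_date": "1947-04-15"}
bad = {"position": "pitcher", "birth_year": "19", "debut_date": "April 15, 1947"}
print(validate(good))
print(validate(bad))
```

Each function corresponds to one validation option you might describe in your outline; your answer can list the same rules in words or as a table of field, rule type, and allowed values.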