Data Cleaning
Keeping your dataset clean is an essential part of getting accurate, trustworthy results. The Glass platform gives you the tools to identify and remove poor-quality or over-quota responses directly — without losing visibility into what’s changed.
Most researchers have their own preferred ways to clean data. In this article we’ll walk you through our recommended approach. Treat these as suggestions to use, adapt, or skip as you see fit!
Step 1: Download Raw Data
Start by exporting your full dataset so you can review responses carefully.
Go to your project’s Analytics tab.
In the left sidebar, click Summary Reports, then open the Raw Data tab.
If your project has multiple waves:
Click Filter Exports by Wave.
Select the wave you want to review.
Click Export to start the download.
You’ll be redirected to the Exports tab — wait until the status says Ready.
Click the download icon (cloud with a downward arrow) to download your file.
Step 2: Prepare the File
Open the file and enable editing.
Apply filters to all columns for easy sorting.
Freeze the top row so you can see the question headers as you scroll.
This will make it easier to scan and filter for potential quality issues.
Step 3: Evaluate Data Quality
Here are a few key indicators to check when cleaning your data:
Length of Interview (LOI)
Found in Column C, measured in seconds.
Calculate the average or median LOI.
Flag respondents who finished in less than 40% of the average time.
Example: If the average is 15 minutes (900 seconds), flag anyone under 6 minutes (360 seconds).
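If you’d rather script this check than use spreadsheet formulas, here is a minimal sketch in Python with pandas. It assumes you saved the raw export as a CSV and that the LOI column (Column C) and respondent ID column are named loi_seconds and respondent_id; both names are placeholders, so adjust them to match your file.

```python
import pandas as pd

# Hypothetical file and column names -- adjust to match your export.
df = pd.read_csv("raw_data.csv")

avg_loi = df["loi_seconds"].mean()   # or .median(), which resists outliers
threshold = 0.4 * avg_loi            # flag anyone under 40% of the average

df["flag_speeder"] = df["loi_seconds"] < threshold
print(df.loc[df["flag_speeder"], ["respondent_id", "loi_seconds"]])
```

Using the median instead of the mean makes the threshold less sensitive to a few very slow respondents.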
Open-Ended Responses
Remove or flag respondents whose open ends are:
Gibberish, nonsense, or copy-pasted text.
One-word or overly short (e.g., “good,” “nice,” “idk”).
Extremely long or suspiciously polished (a possible sign of AI-generated text).
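Automated checks can’t catch everything here (AI-polished text in particular usually needs a human read), but a quick script can surface the obvious low-effort answers. A minimal sketch, assuming a hypothetical open-end column named open_end_1:

```python
import pandas as pd

df = pd.read_csv("raw_data.csv")

# Small stoplist of common low-effort replies; extend it for your study.
LOW_EFFORT = {"good", "nice", "idk", "n/a", "none"}

def flag_open_end(text):
    # Flag blanks, one-word answers, and stoplist matches for manual review.
    if not isinstance(text, str) or not text.strip():
        return True
    normalized = text.strip().lower()
    return normalized in LOW_EFFORT or len(normalized.split()) <= 1

df["flag_open_end"] = df["open_end_1"].apply(flag_open_end)
```

Treat these flags as candidates for review rather than automatic removals.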
Contradictions or Inattention
Look for combinations of answers that don’t make sense together.
Check attention-check or red herring questions (e.g., fake brands).
Consider whether answers suggest fatigue or random clicking in longer surveys.
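If your survey included a red-herring question, you can flag failures programmatically. A minimal sketch, assuming a hypothetical multi-select awareness column named brands_aware exported as comma-separated text, with “Brandex” standing in for your fake brand:

```python
import pandas as pd

df = pd.read_csv("raw_data.csv")

FAKE_BRAND = "Brandex"  # hypothetical red-herring brand name

# Flag anyone who claimed awareness of the brand that doesn't exist.
df["flag_red_herring"] = df["brands_aware"].fillna("").str.contains(FAKE_BRAND)
```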
Straightliners (in matrix questions or grids)
If a respondent gives the same answer across a long battery of scale questions, it may indicate disengagement.
This is especially important when testing concepts or comparing attributes.
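Straightlining is easy to detect in a script: count the distinct answers each respondent gave across the grid columns. A minimal sketch, assuming the battery was exported as hypothetical columns q5_1 through q5_8:

```python
import pandas as pd

df = pd.read_csv("raw_data.csv")

# Hypothetical column names for an 8-item scale battery.
grid_cols = [f"q5_{i}" for i in range(1, 9)]

# One unique value across the whole row means the respondent gave the
# identical answer to every item in the grid.
df["flag_straightliner"] = df[grid_cols].nunique(axis=1) == 1
```

For short batteries, where identical answers can be legitimate, you may want a softer rule (for example, flagging only when the pattern repeats across several grids).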
Laggards
Respondents with unusually long LOIs don’t always need removal. Only exclude them if they also show other signs of poor quality (like gibberish open ends or inconsistent responses).
Step 4: Remove Respondents in the Glass Platform
Once you’ve identified poor-quality or over-quota respondents, you can consider removing them from your dataset. Email your Glass account manager for more information.
Pro Tip: Build in quality controls (like red herrings or logic checks) during survey design to reduce cleanup later.
