Spanish in Boston: Sociolinguistic Dataset & Analysis

Overview

This project involved designing and analyzing sociolinguistic datasets to investigate variation in Spanish as spoken in Boston. It demonstrates end-to-end experience in data creation, annotation design, quality assurance, and statistical analysis.

My Role

  • Designed annotation guidelines for novel linguistic variables
  • Managed dataset collection, curation, and QA workflows
  • Supervised and trained student annotators
  • Led the full research lifecycle, from data design to statistical modeling
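
One common QA step for annotation work like that described above is checking agreement between two annotators on the same tokens. The following is a minimal sketch of Cohen's kappa, not the project's actual pipeline: the function names, the label values ("a", "b"), and the sample annotations are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Proportion of tokens on which the two annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def disagreements(labels_a, labels_b):
    """Indices of tokens to send back for review or adjudication."""
    return [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b]


# Hypothetical annotations of four tokens for one linguistic variable.
annotator_1 = ["a", "a", "b", "b"]
annotator_2 = ["a", "a", "b", "a"]

print(f"kappa = {cohen_kappa(annotator_1, annotator_2):.2f}")  # kappa = 0.50
print("tokens to review:", disagreements(annotator_1, annotator_2))  # [3]
```

Kappa is used here rather than raw percent agreement because it discounts the agreement two annotators would reach by chance when one variant is much more frequent than the others.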

Data & Methods

Outcome

  • Produced structured datasets for analyzing Spanish variation
  • Generated findings contributing to dissertation research
  • Demonstrated scalable approaches to linguistic data annotation and QA

Authors

Lee-Ann Vidal Covas
Language Scientist (PhD, Boston University) with expertise in sociolinguistic research, dataset curation, and applied data science.