This article was first posted on Towards Data Science.

One of our customers recently posed this question: “I would like to set up an OKR for ourselves [the data team] around data availability. I’d like to establish a single KPI that would summarize availability, freshness, and quality. What’s the best way to do this?”

I can’t tell you how much joy this request brought me. As someone who is obsessed with data availability (yeah, you read that right: instead of sheep, I dream about null values and data freshness these days), this is a dream come true.

Why does this matter?

If you’re in data, you’re either currently working on a data quality project or you just wrapped one up. It’s the law of bad data: there’s always more of it.

Traditional methods of measuring data quality are often time- and resource-intensive, spanning several variables, from accuracy (a no-brainer) and completeness to validity and timeliness (in data, there’s no such thing as being fashionably late). But the good news is that there’s a better way to approach data quality.

Data downtime, periods of time when your data is partial, erroneous, missing, or otherwise inaccurate, is an important measurement for any company striving to be data-driven. It might ...
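As a rough illustration of how a data team might collapse downtime into a single KPI, here is a minimal sketch. The formula (total downtime = sum over incidents of time-to-detection plus time-to-resolution, turned into an uptime ratio over the reporting period) and all names are my assumptions for illustration, not something prescribed by the article.

```python
# Illustrative sketch only: one hypothetical way to roll data availability,
# freshness, and quality incidents into a single "data downtime" KPI.
from dataclasses import dataclass


@dataclass
class DataIncident:
    hours_to_detection: float   # time until the bad data was noticed
    hours_to_resolution: float  # time until it was fixed


def data_downtime_hours(incidents: list[DataIncident]) -> float:
    """Total downtime: each incident costs its detection plus resolution time."""
    return sum(i.hours_to_detection + i.hours_to_resolution for i in incidents)


def uptime_ratio(incidents: list[DataIncident], period_hours: float) -> float:
    """Single KPI: fraction of the period the data was trustworthy."""
    return max(0.0, 1.0 - data_downtime_hours(incidents) / period_hours)


# Example: two incidents over a 30-day (720-hour) month.
incidents = [DataIncident(4, 2), DataIncident(1, 1)]
print(round(uptime_ratio(incidents, period_hours=720), 4))  # prints 0.9889
```

A ratio like this reports cleanly as an OKR target (e.g. "keep data uptime above 99%"), while the underlying incident log preserves the detail about which dimension (freshness, completeness, accuracy) actually failed.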
Read More on Datafloq