The right way to think about Data Quality, from Kimball and Uber’s points of view

Photo by Maxime Agnelli on Unsplash

If you’ve spent any amount of time in business intelligence, you know that data quality is a perennial challenge. It never really goes away.

For instance, how many times have you been in a meeting and found that someone had to vouch for the numbers being presented?

“These reports show that we’re falling behind competitor X,” someone might say, only to be interrupted:

“How do you know these numbers are right?”

“Well, I got them from Dave in the data team.”

“And you trust him?”

“We went through the numbers together. I can vouch for these numbers.”

“Alright…


Schlep exists in every data project. Once you notice it, you begin to look out for it wherever cool data projects exist

Photo by Pixabay on Pexels

Here’s an uncomfortable truth (or an annoying reminder, if you’re an old hand at this): before you can get anything interesting done in data, you need to get past a period of data schlep first.

What is data schlep? Schlep is an informal term that means tedious, boring work. You can already see where we’re going with this: data schlep refers to all the boring data cleaning and plumbing and transforming that you do at the start of any data project.

At Holistics, we sometimes talk to business leaders in South East Asia who are at the beginning of their…


Image by author

Goodhart’s law is a famous saying named after the British economist Charles Goodhart, which usually goes “when a measure becomes a target, it ceases to be a good measure.”

This idea is of interest to businesspeople, managers, and data analysts alike — and for good reason: companies are usually run using metrics, and few things are worse than a well-meaning metric turned bad.

A famous example of this is what is now called the ‘cobra effect’. The story goes as follows: in India, under British rule, the colonial government was concerned about the number of venomous cobras in…


The year was 1983. Larry Ellison, over at a tiny company called Oracle, was focused on the fallout of a buggy database product rewrite. In the rearview mirror, catching up quickly, was computer science professor and eventual database legend Michael Stonebraker.

In his book, Softwar, author Matthew Symonds tells the story like so:

Ellison was still not giving much of his attention to what was or wasn’t happening in sales. As far as Ellison was concerned, overwhelmingly the most important contribution he could make to Oracle’s success was to concentrate on making the product better. He simply didn’t regard himself…


To build a great reading culture, we have high standards for books at Holistics. If you asked us what to read for a career in data, we’d usually respond with some of the classics. And the bar for classics should be very high, for good reason: new books can be faddish. A book about Redshift is all well and good, until Redshift stops being the hottest data warehouse in town.

In other words, if you are going to invest a significant amount of time in reading a book, better to read something that’s timeless.


Photo by Adam Nowakowski on Unsplash

There are no answers in this piece, only two anecdotes in the service of a single question.

Onboarding in a Hot Local Startup

I was talking to a software engineer friend a few days ago. He had been watching his partner set up at their new job as a data analyst.

“The handover was pretty terrible!” he said, wringing his hands. “They literally gave her a .txt file with SQL queries. No comments! Hardcoded dates for ‘past 30 days’! And this was at <name of well-known local eCommerce startup>!”

I laughed. “That’s actually pretty normal”, I said. “Did she have a mentor?”

“Well, yes”, my…


There’s something going on with ‘metadata hubs’ today

Image by author

Metadata hubs (sometimes known as metadata search & discovery tools) seem to be another trend in the analytics space. In the past two years alone, we’ve seen a whole host of metadata hub projects being released, written about, or open sourced by major tech companies. These include:

In this post, I’ll quickly summarise the problem these products are…


DATA LOVER / RANDOM THOUGHTS

Over the past couple of months there have been a number of articles about the death of dashboards. See, for instance, Taylor Brownlow’s piece on Towards Data Science titled Dashboards are Dead, and the pithily named deathofdashboards.com.

The argument goes something like this:

  • Dashboards are the default option for analytics, and everyone gets a dashboard. Soon, there are too many dashboards in the enterprise. You find yourself with a serious report sprawl problem. It all sucks. The answer is to invert the report sprawl problem and to get rid of dashboards entirely.
  • With dashboards, you get ‘death by 1,000 filters’…


How the changing cost structure of columnar databases has driven the shift from pre-building OLAP cubes to running OLAP workloads directly in-database.

Photo by Christian Fregnan from Unsplash

Update: Confused about the complex analytics landscape? Check out our book: The Analytics Setup Guidebook.

One of the biggest shifts in data analytics over the past decade is the move away from building ‘data cubes’, or ‘OLAP cubes’, to running OLAP* workloads directly on columnar databases.

(*OLAP means online analytical processing, but we’ll get into what that means in a bit).

The decline of the OLAP cube is a huge change, especially if you’ve built your career in data analytics over the past three decades.


What do you think makes a good data analyst?

One question that comes up often when we talk to people working in (or wanting to get into) the data industry is this: How should I prepare for a data analyst role, and how should I prepare for the interview?

In this post, we’re going to share some of our perspectives on this question. From there, you will be able to better equip yourself with the right skills before your big moment.

First off, think about what problems the company is hiring you to solve

Not every data analyst’s role is the same, much as not every CEO’s…

Huy Nguyen

CTO of Holistics.io (self-service BI platform) — Confused about BI/analytics landscape? Read this book: https://www.holistics.io/books/setup-analytics/
