To Merge or Not to Merge?

Merging data is convenient, but not always correct. This article explores the trade-offs between flat tables and dimensional models, and how those choices affect analysis accuracy.

Joseph Finch

1/7/2026 · 1 min read

To merge or not to merge?
That’s not a technical question. It’s a modeling decision.

While working with Our World in Data (OWID) education data, the temptation to merge everything into one flat table was constant. It’s fast. It’s convenient. It also hides risk.

Example:
Take average years of schooling and school enrollment rates. Both are keyed by country and year, but they often cover different age groups and rest on different methodologies. Merged into one table, they look compatible. Aggregate them carelessly, and you get clean charts with quietly wrong conclusions.
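Here is a minimal pandas sketch of that merge. The country codes, values, and age-group notes are invented for illustration and are not OWID figures. The column names are assumptions too.

```python
import pandas as pd

# Made-up values for illustration only; not actual OWID data.
schooling = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": [2010, 2020, 2010, 2020],
    # Hypothetical definition: measured over adults aged 25+
    "avg_years_schooling": [6.0, 7.0, 10.0, 11.0],
})
enrollment = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": [2010, 2020, 2010, 2020],
    # Hypothetical definition: measured over primary-school-age children
    "enrollment_rate": [90.0, 95.0, 98.0, 99.0],
})

# The keys line up, so the merge succeeds without complaint.
flat = schooling.merge(enrollment, on=["country", "year"], validate="one_to_one")

# The two measures now sit side by side, but nothing in `flat` records
# that they describe different populations or methodologies.
print(flat.columns.tolist())
```

The merge is technically valid. What disappears is the context that lived only in the comments: who each number is about.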

Flat tables are great for:

  • exploration

  • teaching

  • tightly scoped questions

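But when data spans multiple grains (the grain is what a single row represents, such as one country-year versus one country-year-age group), flat models blur meaning. Averages lie convincingly.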
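A small sketch of how a grain mismatch skews an average. It assumes one table at country-year grain and a second at country-year-level grain. All names and numbers are invented for illustration.

```python
import pandas as pd

# Grain: one row per country-year (made-up values).
schooling = pd.DataFrame({
    "country": ["A", "B"],
    "year": [2020, 2020],
    "avg_years_schooling": [6.0, 12.0],
})

# Grain: one row per country-year-level (made-up values).
enrollment = pd.DataFrame({
    "country": ["A", "A", "A", "B"],
    "year": [2020, 2020, 2020, 2020],
    "level": ["primary", "secondary", "tertiary", "primary"],
    "enrollment_rate": [95.0, 70.0, 20.0, 99.0],
})

# Merging repeats country A's schooling value once per enrollment row.
flat = schooling.merge(enrollment, on=["country", "year"])

naive_mean = flat["avg_years_schooling"].mean()         # A counted three times
correct_mean = schooling["avg_years_schooling"].mean()  # each country counted once
print(naive_mean, correct_mean)  # 7.5 9.0

# Asking pandas to validate the expected relationship surfaces the mismatch.
try:
    schooling.merge(enrollment, on=["country", "year"], validate="one_to_one")
except pd.errors.MergeError as exc:
    print("grain mismatch caught:", exc)
```

Nothing in the flat table flags the 7.5 as wrong. The `validate` argument is a cheap guard if you do decide to merge.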

Dimensional models slow you down upfront, but they force clarity:

  • what is measured

  • at what grain

  • under which assumptions
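Those three questions can be written into the structure itself. The sketch below is one possible layout, not a prescribed design: a hypothetical indicator dimension that states what is measured, at what grain, and for which population, plus a fact table keyed to it. Table names, column names, and values are all assumptions for illustration. A stricter dimensional design might keep a separate fact table per grain.

```python
import pandas as pd

# Dimension: describes each measure once (hypothetical metadata).
dim_indicator = pd.DataFrame({
    "indicator_id": [1, 2],
    "name": ["avg_years_schooling", "enrollment_rate"],
    "unit": ["years", "percent"],
    "population": ["adults 25+", "primary-school-age children"],
    "grain": ["country-year", "country-year"],
})

# Fact: one row per country, year, and indicator (made-up values).
fact = pd.DataFrame({
    "country": ["A", "B", "A", "B"],
    "year": [2020, 2020, 2020, 2020],
    "indicator_id": [1, 1, 2, 2],
    "value": [6.0, 12.0, 95.0, 99.0],
})

# Enforce the declared grain: no duplicate key combinations.
assert not fact.duplicated(["country", "year", "indicator_id"]).any()

# Aggregate per indicator, so different measures are never averaged together
# and the unit and population travel with every result.
report = (
    fact.merge(dim_indicator, on="indicator_id", validate="many_to_one")
        .groupby(["name", "unit", "population"], as_index=False)["value"]
        .mean()
)
print(report)
```

It takes more setup than a single merge, but every number in `report` carries its unit and population with it.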

The trade-off is simple:
Merging optimizes for speed.
Dimensional modeling optimizes for trust.

Good analysts aren’t dogmatic. They choose structure based on the question, and they respect the consequences of that choice.