Should I use a has_many association or just repeat the information?

So in my app every country has its top 10 news articles for the week, and every week these articles get replaced by a new set of articles. Which of the following should i do:

1) Create a table for each country that’s attributes would be the 10 articles plus some other info.

2) Have a country table which has attributes like its ISO code and each country has_many articles which reference an article that has_one country.

3) Or just one table that is articles and I filter by queries where column: country_code == ‘KR’ or something?

Let’s see what these would look like in database terms:

Option 1: We would have a “country_and_articles” table that probably looks something like:

The clue is in the name – a single table representing two things. Urg.

Also, what if we want more than ten articles for a given week. Our design has us locked in.

The third option looks a bit better:

We’re only dealing with articles, but we’re going to have ten for each country code for each week. So we need to select on a country code and a date to pull out that particular week’s articles – and they’re not in any particular order (although we could add a position: integer to sort it).

One of the rules of database normalisation is that any repeating elements should be moved off into their own table. Here, we’re repeating two elements – the country code and the week. Why is this important? In the olden days there were issues around database size – repeating lots of data instead of just having it in one place with multiple references is inefficient. There are also issues around modifying that data – what if the People’s Republic of Agricultura (country code PRA) changes its name to the Democratic Republic of Industria (country code DRI) – we have to go through all records and update them. Which for a big, high performance database could cause issues.

So actually what we want is the second option. The second option plus. As noted above, we have two repeating elements – the country code and the week. So our ideal database design would look like this:

We have a representation of our countries and our weeks – and the articles themselves are attached to both. This is nice and clean – each table represents one concept, and one concept alone. And the use of indexes means we shouldn’t hit any performance issues when looking at all articles for a particular week, or all articles for a given country – or both.

Once again, it’s a bit more work than the other two solutions, a bit more typing, a little more overhead. But the clarity it provides is all important as your application ages and you have to fight against time to get things shipped before your client yells at you.

Do you know what to do but not how it works?

Ever wanted to understand why Rails views work the way that they do? Why variables from your controllers are visible inside your views?

Sign up below to get a free 5 part email course on how Ruby on Rails view rendering works and gain a deep understanding of the Rails magic.

We will send you the course, plus the occasional update from this web-site. But no spam, we promise, and it's easy to unsubscribe