Hi Trevor,
This is a good paper to read about the basics of column-oriented databases:
http://db.lcs.mit.edu/projects/cstore/vldb.pdf
If you go to Section 2 (Data Model), the authors present the data model with a sample EMP table.
The example shows that the EMP table contains four columns: Name, Age, Dept and Salary.
Projections are then formed from this table. (For Example 1, the paper shows the creation of four projections.)
EMP1 (name, age)
EMP2 (dept, age, DEPT.floor)
EMP3 (name, salary)
DEPT1(dname, floor)
As you can see, the same column information gets duplicated in different projections.
The advantage is that a query involving only name and age can read EMP1 and need not scan the other columns. But the storage requirements go up, since there is redundancy. As you may know, increasing data redundancy helps selects at the cost of inserts, updates and deletes.
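To make the trade-off concrete, here is a rough Python sketch of the idea. It is not how C-Store is implemented; the sample rows, the floor numbers and the helper names (make_projection, names_older_than, copies_of, update_age) are all made up for illustration. It also assumes every projection keeps rows in the same order, which is a simplification, and it leaves out DEPT1.

```python
# Hypothetical sample data; none of these values come from the paper.
EMP = [
    {"name": "Alice", "age": 30, "dept": "Eng", "salary": 100},
    {"name": "Bob", "age": 25, "dept": "Ops", "salary": 80},
    {"name": "Carol", "age": 41, "dept": "Eng", "salary": 120},
]
DEPT = {"Eng": 2, "Ops": 1}  # dname -> floor (made-up floors)


def make_projection(rows, columns):
    """Store each column of the projection as its own list (column-wise)."""
    return {col: [row[col] for row in rows] for col in columns}


# EMP2 carries DEPT.floor, so look the floor up before projecting.
emp_with_floor = [dict(row, floor=DEPT[row["dept"]]) for row in EMP]

projections = {
    "EMP1": make_projection(EMP, ["name", "age"]),
    "EMP2": make_projection(emp_with_floor, ["dept", "age", "floor"]),
    "EMP3": make_projection(EMP, ["name", "salary"]),
}


def names_older_than(min_age):
    """A select on name and age reads EMP1 only; dept and salary are untouched."""
    emp1 = projections["EMP1"]
    return [n for n, a in zip(emp1["name"], emp1["age"]) if a > min_age]


def copies_of(column):
    """List the projections that store a copy of the given column."""
    return [name for name, proj in projections.items() if column in proj]


def update_age(row_index, new_age):
    """An update must be applied to every projection holding the column."""
    touched = copies_of("age")
    for name in touched:
        projections[name]["age"][row_index] = new_age
    return touched


print(names_older_than(28))
print(copies_of("age"))
print(update_age(1, 26))
```

The select reads a single projection, while the age update has to write to both EMP1 and EMP2, since each of them stores its own copy of the column.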
This is what I was trying to say.
Thanks,
Gokul.