[nycphp-talk] table structure for "friend" relationships
Paul A Houle
paul at devonianfarm.com
Fri Jul 31 11:14:41 EDT 2009
Hans Zaunere wrote:
> In many modern web applications, relational pedantry has been fading.
> Denormalization is often a good thing, and very large sites are in fact
> highly denormalized. This is due to the type of data they need to store
> (friend relationships is a classic one), and the level of throughput they
> need to achieve.
>
I think about 90% of people who scream "down with normalization" are
the same people who were creating Microsoft Access databases 10 years
ago with three phone number columns rather than creating a separate
table for phone numbers. (I spent about 6 months fixing those kinds of
apps, and that's enough for me...)
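
As a rough sketch of the separate-table approach mentioned above (Python
with the standard sqlite3 module; the table names, column names, and
sample numbers are all made up for illustration, not taken from any real
application):

```python
import sqlite3

def build_phone_schema(conn):
    """Create a person table plus a separate phone table (one row per number)."""
    conn.executescript("""
        CREATE TABLE person (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE phone (
            person_id INTEGER NOT NULL REFERENCES person(id),
            kind      TEXT NOT NULL,   -- hypothetical labels: 'home', 'work', ...
            number    TEXT NOT NULL
        );
    """)

def phones_for(conn, person_id):
    """Return every (kind, number) pair for one person, sorted by kind."""
    rows = conn.execute(
        "SELECT kind, number FROM phone WHERE person_id = ? ORDER BY kind",
        (person_id,),
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")
build_phone_schema(conn)
conn.execute("INSERT INTO person (id, name) VALUES (1, 'Alice')")
# A fourth number needs no schema change, unlike phone1/phone2/phone3 columns.
conn.executemany(
    "INSERT INTO phone (person_id, kind, number) VALUES (?, ?, ?)",
    [(1, "home", "555-0100"), (1, "mobile", "555-0101"),
     (1, "work", "555-0102"), (1, "fax", "555-0103")],
)
```

The point of the design is that the number of phone numbers per person
lives in the data, not in the column list.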
In the "social media" sorts of applications, I think it's wise to
pursue an 'eventually consistent' strategy. If you don't think about
consistency at all, the day will come when your database breaks and
you find it's a pile of spaghetti that doesn't make any
sense. One strategy is to have a normalized 'core' and then denormalize
the data to make views that are very fast. I'm doing a lot of semantic
web stuff these days and I'm finding that I need to create materialized
views all the time.
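
One way the "normalized core plus denormalized view" idea could look for
friend relationships, as a minimal sketch only. SQLite has no built-in
materialized views, so this assumes a hand-maintained cache table that is
rebuilt from the core; the names friendship, friend_list_cache, and
refresh_cache are hypothetical:

```python
import sqlite3

def create_schema(conn):
    conn.executescript("""
        -- Normalized core: one row per directed friend edge.
        CREATE TABLE friendship (
            user_id   INTEGER NOT NULL,
            friend_id INTEGER NOT NULL,
            PRIMARY KEY (user_id, friend_id)
        );
        -- Denormalized read model: one row per user, cheap to fetch.
        CREATE TABLE friend_list_cache (
            user_id      INTEGER PRIMARY KEY,
            friend_ids   TEXT NOT NULL,     -- comma-separated ids
            friend_count INTEGER NOT NULL
        );
    """)

def add_friendship(conn, a, b):
    """Writes go to the core only; the edge is stored in both directions."""
    conn.executemany(
        "INSERT OR IGNORE INTO friendship (user_id, friend_id) VALUES (?, ?)",
        [(a, b), (b, a)],
    )

def refresh_cache(conn):
    """Rebuild the cache from the core in one transaction."""
    with conn:
        conn.execute("DELETE FROM friend_list_cache")
        conn.execute("""
            INSERT INTO friend_list_cache (user_id, friend_ids, friend_count)
            SELECT user_id, group_concat(friend_id), count(*)
            FROM friendship
            GROUP BY user_id
        """)

def cached_friend_count(conn, user_id):
    row = conn.execute(
        "SELECT friend_count FROM friend_list_cache WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    return row[0] if row else 0

conn = sqlite3.connect(":memory:")
create_schema(conn)
add_friendship(conn, 1, 2)
add_friendship(conn, 1, 3)
refresh_cache(conn)
```

Between refreshes the cache can be stale, which is the "eventually
consistent" part; because it can always be rebuilt from the core, a
corrupted cache is an inconvenience rather than a disaster.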
Myself, I'm torn between wanting schema flexibility and
wanting richer schemas: I want more data types and data
dictionaries that record more about what the data in the database means.
With a rich schema, many parts of your apps can "write themselves."
I've also got a lot of fear that "schemaless" systems are going to
be "futureless" systems. Look at the sad story of Java serialization
and object databases: once an object gets persisted into a database,
you can't make changes to it as easily as you can make changes to an
ordinary object that lives in RAM. Object database vendors haven't come
up with a decent story for how to migrate databases over the long term.
On the other hand, I've seen relational databases in business
applications that have survived 5, 10, even 25 years of changing
business requirements. Relational databases seem to have struck about
the right balance between a schema that is easy to change and one that
is hard to change.
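
To illustrate the kind of in-place change a relational schema tolerates,
here is a small sketch (Python with sqlite3; the member table, the email
column, and the migrate_add_email helper are invented for the example).
Existing rows survive the change and simply read NULL for the new column:

```python
import sqlite3

def migrate_add_email(conn):
    """Add a nullable email column to member if it is not already there."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    cols = [row[1] for row in conn.execute("PRAGMA table_info(member)")]
    if "email" not in cols:
        conn.execute("ALTER TABLE member ADD COLUMN email TEXT")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO member (id, name) VALUES (1, 'Alice')")

migrate_add_email(conn)
migrate_add_email(conn)  # second call is a no-op, so the migration is repeatable
```

Harder changes (splitting a table, changing a key) take more work, which
is the "hard" half of the balance.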
----------
My dream database is something like the CycL system created for the
Cyc project, though I'd add full RDF capability and dump most of the
Cyc ontology on the floor. Add SPARQL and relational-mapped views...