[nycphp-talk] Flexible Forms & How to store them...
Kenneth Downs
ken at secdat.com
Wed May 16 07:35:49 EDT 2007
csnyder wrote:
> On 5/15/07, Jon Baer <jonbaer at jonbaer.com> wrote:
>> This is really a 2 layer approach not using MySQL for FT indexing.
>> Any other option gives you
>> way more IR functions than MySQL does. Although it would be fair to
>> point out that it is pluggable: ( <http://dev.mysql.com/doc/refman/
>> 5.1/en/plugin-full-text-plugins.html> ) But still pretty new.
>
> I'm all for the 2 layer approach -- at that point it doesn't really
> matter how you store the data in the db. YAML seems especially
> friendly, but it could be RDF, some other flavor of XML, or even a
> serialized PHP object. As a developer you get a tremendous amount of
> flexibility at a negative cost, because you don't have to write any
> complex SQL, ever, once you take care of CRUD.
>
> It scares the beejeesus out of anyone with a background in databases,
> though. The people I work with think I'm completely off my nut for
> storing data this way. They're asking me to use custom tables for even
> the simplest business objects, despite the fact that the model changes
> constantly. I'd love to find a technique that we could all embrace.
>
> If Ken Downs is reading this thread, I hope next Tuesday's talk will
> touch on this topic...
I've been following but I don't know enough about the underlying problem
to comment beyond general rules.
The approach under discussion is known as Entity-Attribute-Value and,
yes, it is considered a major mistake when used in a SQL database
because SQL is all about using rows and tables, and you can't do any of
that anymore. You've left yourself with no reasonable query language.
What used to be "select name,city from mytable" now requires a join (or
left join) at very least. The trouble only gets worse. Since
everything you are doing with a relational database requires SQL, and
since SQL has no tables to query, simple concepts like JOINS become
insane, performance drops, and as you said the database people start
calling you names and will refuse to help. Expect a sort of "it's your
funeral" attitude.
Here is an unusually non-bitter presentation on wikipedia:
http://en.wikipedia.org/wiki/Entity-Attribute-Value_model
The real underlying problems usually come down one or more of:
1) Unknown structure
2) Known structure, but seen as "too complex"
3) Structure *appears* to change often
4) Confusion between structured data and text data
5) Desire for "flexibility" in structure
6) Weak tools for creating/updating structures
For #1, we can probably agree if that if the structure is not yet known,
nobody should be coding. Go back to the users for more interviews.
For #2, the standard reply is that EAV is not going to solve the
underlying problem, its going to make it worse. Complex structures will
require complex queries, which are impossible using EAV in a SQL
database. This is probably a sticky situation where somebody needs to
go ask the paymaster for more money and time.
For #3, I underscore "appears" to change. While structures do change,
the appearance of rapid change can often be traced to other factors. A
skilled table designer can listen to the apparently random noise
generated by the users and discern patterns that the users are unaware
of. They will often think in details, we are supposed to be thinking
systematically.
For #4, what I mean is that you may have things like product
descriptions for an electronic catalog, where many different product
categories contain zillions of little details. A great example of this
is mwave. EAV looks good when confronting the bewildering prospect of
making a table for this stuff. But it often turns out that all items
have real attributes that must be in the database, not the least of
which are price and weight, but a lot of the rest of it is descriptive,
and can go into a text field (it can be html, or wiki text as well).
For #5, This is usually a symptom that the programmer does not yet know
if his problem is 1,2,3,4 or 6. He is having problems matching code to
table structure, and wants to solve that problem with "flexibility",
some way of lessening the pain of trying to keep these two coordinated.
It may be that he needs a better tool to do so (see #6), or he may be
confronting #3 and #4 and not realizing that he needs to nail down the
structure more.
For #6, in Ken's humble opinion the absence of strong tools is the
number one problem for those who handle tables every day. It is quite
simply a royal PITA to build tables, modify them, and then track changes
and get them into production. If you get into stored procedures and
triggers the problems with basic chores of version control, debugging
and so forth only get worse. This basic chore was the first thing I
coded in Andromeda, it's something that's crying out to be done. The
absence of these tools drives people to avoid structure changes and is
the number one reason people find themselves trying to write code that
will somehow not depend on the table structure. My own answer was to
write Andromeda so that it would be easy to match my code to the tables.
--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010
More information about the talk
mailing list