At last with JDO and EJB, with Hibernate and TopLink, with Kodo and OJB, there is no longer any need to write SQL, or even use a relational database. So where's the catch?
Relational databases originally won out over hierarchical or file-based databases not because they were more performant, more efficient or had better support, but because of the qualities of the relational model, which has since the early days been accessed through the (arguably broken) Structured Query Language (SQL). SQL is a victim of its own history and success. It is overly verbose. It lacks symmetry. It struggles to be effective without resorting to extensions to the agreed standard. It's in a different league compared to Java and C#, for example. These object oriented languages are so much more clean and exact in the way they represent and manipulate data as well as being infinitely more flexible and frankly, let's face it, more beautiful.
The fact is that this beauty hides a mess of ugliness beneath, whereas the ugliness of SQL hides a quiet beauty that is not even noticed by most casual writers of select statements.
So what is object-oriented programming's dark secret? Simply put, the strength of OO derives from the fact that it generates structures in programming languages that are akin to the way human minds happen to process information. We love to classify things, put boxes around them, create abstractions where a complex concept can be treated in a way that doesn't need to include the details. We use these structures very effectively: we can classify all of the organisms on earth into Kingdoms, Families and Species: one huge inheritance hierarchy of evolution. We can simultaneously understand the nature of an aeroplane as being like a bus (except that it flies) or like a bird (except that it's man-made, and bigger), both being different abstract ways of understanding the same object. And object oriented programming helps us all the way: for every abstraction we can think of, there is an interface. For every 'thing' that exists in our simulated world, there is an object class. This is 'common sense injection' and we can use it to structure our very code, so that things that don't make sense can't even be expressed in our language.
There is a problem, however, with common sense injection, and it is this: this approach to modelling systems is only one possible way of doing things, and it doesn't work very well in many cases. Einstein did not develop special relativity by thinking about space and time using his common sense; rather, he took a simple observation (that light needs no frame of reference to limit its speed) and then used mathematics to derive the unavoidable, but distinctly non-common-sense, conclusions. Neither did Keynes come up with his ideas about economics by thinking in accepted common sense terms. Sometimes our common sense does not permit us to reach a deeper understanding of an underlying model. The reason for us organizing our thoughts in the way we do has troubled philosophers since Plato, but the fact is that, despite two and a half millennia of effort, nobody has come up with a convincing explanation. There is nothing intrinsically correct about organizing the world in this way, and modelling programs as such is only useful because that's how most people's minds work. Formal, correct and complete as it may seem, object oriented programming has its basis in arbitrary human behaviour.
SQL is different. Behind SQL, and somewhat obscured by its clunky syntax, sits a body of mathematics called relational calculus which, like the rules that tell us what to do when adding, subtracting or multiplying two numbers, provides us with a set of operations we can apply to 'relations', which equate roughly to tables in a relational database. SQL as a language evolved as a de facto standard in the era of COBOL and has been slow to modernize its syntax and clean up its idiosyncrasies because, to be honest, it isn't in the commercial interests of any of the database vendors to do it. There has been agonizingly slow progress with the ANSI process, and now at least some of the features introduced by vendors in the early 1990s are starting to become available through this standard, at the same time as the vendors are further extending their databases' capabilities. Of course there have long been calls for a proper symbolic relational calculus as an agreed standard to access relational databases, but that's another story.
The bottom line is that the relational model is a formal mathematical representation for data. Mathematical abstractions are useful because they help us to solve problems that common sense finds difficult to access.
Alice starts north from Abbotsville going at 50mph and Bob starts from Boxford at 40mph at the same moment. Boxford is 20 miles directly north of Abbotsville. Ignoring relativistic effects, and assuming both travel in a straight line north, how far from Boxford are they when Alice passes Bob?
Go on, solve it, you know you want to. The fact is, most people solve this kind of problem by writing down an equation and solving the equation. While solving the mathematics, they are not thinking about what the variables in the equation represent; they are going through a logical process to reach an answer that is difficult to access using a common sense approach (and even harder if you do take relativity into account).
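For reference, the equation route can be written out as follows. The symbols are introduced here and are not in the original post: t is the time in hours since both set off, and x is distance in miles north of Abbotsville.

```latex
\begin{align*}
x_A(t) &= 50t, \qquad x_B(t) = 20 + 40t \\
50t &= 20 + 40t \;\Rightarrow\; t = 2 \text{ hours} \\
x_A(2) &= 100 \text{ miles north of Abbotsville} \\
x_A(2) - 20 &= 80 \text{ miles north of Boxford}
\end{align*}
```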
Relational databases are a formal mathematical representation that is useful when solving data problems. It so happens that SQL is the 'standard' for accessing the relational algebra needed to manipulate this relational data. This highlights the true impedance mismatch between object oriented programming and relational databases. The former is an arbitrary, but common sense, model of the world, whereas the latter is a formal mathematical abstraction of the data. The problem is not helped by the idiosyncrasies and dialects of SQL, but this fact actually hides the true nature of what is required to reconcile these two views.
So much yada yada, so where are the examples? The problem with coming up with approaches to dealing with complex systems, is that the simple examples don't really shed much light on the problem. However, let's give it a go. Suppose we are building a stock tracking system. We have stock items that have a stock index, a colour, a shape and a price. We have clients in the database with a name, address and telephone, and purchase records that record which clients bought which stock items for which price.
The system is pure common sense: there are 'person' things representing clients, 'stockitem' things representing lines of stock, and 'purchase' items representing a transaction. You have implemented this system in an object oriented language, using an object oriented, XML, or other non-relational database. Everything has been working well for years, and your little company has gone from strength to strength, raking in the profits. So well, in fact, that you decide to buy out another company and leverage your loyal client base. You now have a new requirement for your little database for doing cross selling:
Give me a list of all the people who bought blue triangular items in Boxford.
Give me a list of people called Smith who paid more than $80 in a single purchase.
What has happened here is that the common sense injection in your original application has let you down. This application had a single purpose: keeping stock records and recording client addresses and payments. Now you want to use it for data mining: a purpose for which it wasn't designed, and you end up having to do a major rewrite to have it answer these simple questions. Had the application been created with a relational database, these questions would be very simple to answer, though it might be difficult, particularly with a real-world problem, to map the answers into your object model using an Object/Relational mapping layer that does not give you full access to the underlying SQL.
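As a rough illustration of how short the relational versions of these two questions can be, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rows are all assumptions made up for this example (the post only describes the data informally), and "in Boxford" is read here as "clients whose address is in Boxford".

```python
import sqlite3

# Hypothetical schema and data: every table and column name below is an
# assumption made for this sketch, not something defined in the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (
        id INTEGER PRIMARY KEY, name TEXT, town TEXT, telephone TEXT);
    CREATE TABLE stockitem (
        id INTEGER PRIMARY KEY, colour TEXT, shape TEXT, price REAL);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        client_id INTEGER REFERENCES client(id),
        stockitem_id INTEGER REFERENCES stockitem(id),
        price_paid REAL);

    INSERT INTO client VALUES
        (1, 'Alice Smith', 'Boxford', '555-0101'),
        (2, 'Bob Jones', 'Boxford', '555-0102'),
        (3, 'Carol Smith', 'Abbotsville', '555-0103');
    INSERT INTO stockitem VALUES
        (1, 'blue', 'triangular', 90.0),
        (2, 'red', 'square', 50.0);
    INSERT INTO purchase VALUES
        (1, 1, 1, 85.0),
        (2, 2, 2, 50.0),
        (3, 3, 1, 75.0);
""")

# Question 1: people who bought blue triangular items in Boxford.
blue_triangle_buyers = [row[0] for row in conn.execute("""
    SELECT DISTINCT c.name
    FROM client c
    JOIN purchase p ON p.client_id = c.id
    JOIN stockitem s ON s.id = p.stockitem_id
    WHERE s.colour = 'blue' AND s.shape = 'triangular' AND c.town = 'Boxford'
    ORDER BY c.name
""")]

# Question 2: people called Smith who paid more than $80 in a single purchase.
big_spending_smiths = [row[0] for row in conn.execute("""
    SELECT DISTINCT c.name
    FROM client c
    JOIN purchase p ON p.client_id = c.id
    WHERE c.name LIKE '% Smith' AND p.price_paid > 80
    ORDER BY c.name
""")]

print(blue_triangle_buyers)
print(big_spending_smiths)
```

Neither query needed the schema to anticipate the question; each is just a join plus a filter over tables that were designed for record keeping.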
So moving back to the original question: has the impedance problem been solved? The risk is that many solutions that map object data to relational databases and vice versa sacrifice the power of the relational model in doing so. By eliminating SQL, or offering a solution that is agnostic as to whether it is using a relational, object or XML database under the covers, you lose the flexibility to change your mind, in the future, about what you want your database to do. And that's got to be a bad thing.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> So moving back to the original question: has the
> impedance problem been solved? The risk is that many
> solutions that map object data to relational
> databases and vice versa sacrifice the power of the
> relational model in doing so. By eliminating SQL, or
> offering a solution that is agnostic as to whether it
> is using a relational, object or XML database under
> the covers, you lose the flexibility to change your
> mind, in the future, about what you want your
> database to do. And that's got to be a bad thing.
I don't think I agree with your statement that solutions that map object data to relational databases sacrifice the power of the relational model.
It seems to me that we get the best of both worlds. We can write code that uses objects (which makes for code that is easier to understand and easier to maintain). And we can fall back to SQL for reporting, answering questions, and data mining.
Not sure I see a problem here. Except that I don't like SQL and wish there were a better, more widely supported alternative for using relational calculus.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> These object oriented languages are so much more clean and
> exact in the way they represent and manipulate data as well
> as being infinitely more flexible and frankly, let's face
> it, more beautiful.
Huh!? Really? This is a very, very bold claim to make without offering any evidence or argument for it. Show some examples of how an OO language offers cleaner or more flexible data manipulation.
> Had the application been created with a relational
> database, these questions would be very simple to
> answer, though it might be difficult, particularly with
> a real-world problem, to map the answers into your
> object model using an Object/Relational mapping layer
> that does not give you full access to the underlying SQL.
This is just not true. There is no problem expressing these queries in an OO query language like HQL / EJB-QL.
> So moving back to the original question: has the
> impedance problem been solved? The risk is that many
> solutions that map object data to relational databases
> and vice versa sacrifice the power of the relational
> model in doing so. By eliminating SQL, or offering a
> solution that is agnostic as to whether it is using a
> relational, object or XML database under the covers, you
> lose the flexibility to change your mind, in the future,
> about what you want your database to do. And that's got
> to be a bad thing.
None of this is true. The whole point of ORM is to allow you to have your cake and eat it. Your data modeller can create a correct relational model. You can map that to a convenient object model. An OO query language like HQL/EJB-QL is so similar to SQL that you lose nothing. Solutions like EJB3 persistence and Hibernate are NOT data model agnostic; rather, they are targeted directly at relational databases.
Frankly, it sounds like you have never actually used a modern ORM solution.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
Hello David
Obviously there are situations where O/R mapping tools are not suited, and where direct manual database manipulation is required.
But in most cases O/R mapping tools are able to deal with the specifics of the underlying data sources, and they now provide a lot of tuning options.
Most of your concerns relate to the first generation of O/R mapping tools, from the 90s. Most modern products have dropped such limitations. Besides this, modern standards like JDO 2 and EJB 3 support native SQL queries and a mix of O/R mapping and direct JDBC. Maybe you haven't played with an O/R mapping product for a long time, or you don't know them very well in terms of internal design.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
Gavin,
Thanks for your feedback, I agree with much but not all.
> > These object oriented languages are so much more clean and
> > exact in the way they represent and manipulate data as well
> > as being infinitely more flexible and frankly, let's face
> > it, more beautiful.
>
> Huh!? Really? This is a very, very bold claim to make
> without offering any evidence or argument for it.
> Show some examples of how an OO language offers
> cleaner or more flexible data manipulation.
>
This was intended as an ironic jab at anyone who might deny that relational is a useful model.
> > Had the application been created with a relational
> > database, these questions would be very simple to
> > answer, though it might be difficult, particularly with
> > a real-world problem, to map the answers into your
> > object model using an Object/Relational mapping layer
> > that does not give you full access to the underlying SQL.
>
> This is just not true. There is no problem expressing
> these queries in an OO query language like HQL /
> EJB-QL.
>
The illustration was really to highlight the shortcomings of the pure object-oriented database approach. HQL / EJB-QL work well for this simple case.
> > So moving back to the original question: has the
> > impedance problem been solved? The risk is that many
> > solutions that map object data to relational databases
> > and vice versa sacrifice the power of the relational
> > model in doing so. By eliminating SQL, or offering a
> > solution that is agnostic as to whether it is using a
> > relational, object or XML database under the covers, you
> > lose the flexibility to change your mind, in the future,
> > about what you want your database to do. And that's got
> > to be a bad thing.
>
> None of this is true. The whole point of ORM is to
> allow you to have your cake and eat it. Your data
> modeller can create a correct relational model. You
> can map that to a convenient object model. An OO
> query language like HQL/EJB-QL is so similar to SQL
> that you lose nothing. Solutions like EJB3
> persistence and Hibernate are NOT data model
> agnostic, rather they are targeted directly at
> relational databases.
>
I would have to disagree with this last statement for a number of reasons.
Firstly, I agree that Hibernate and EJB 3 are targeted at relational databases, though other solutions are not.
There is in fact a fundamental difference between HQL/EJB-QL and SQL, in that the former make assumptions about what the data model means whereas the latter uses relational algebra on data in a relational model. One repercussion of this in practice is that if you want to look at the same data in two slightly different ways, you need to lay a different object model over the top of it for each of those views.
For example, consider a database that stores information about financial products. There are tens of different identifiers that are used to reference financial products (CUSIP, SEDOL, ISIN, RIC, Ticker, etc.), some of which are better for different markets or different instrument types. A strategic database employs a normalized form of this data where 'Product' and 'Identifiers' are stored in separate tables, and each identifier has a reference to the product, a type and a value.
Meanwhile, a domain-specific user of this data might be operating in a market where she only cares about two possible identifiers and for convenience wants to work with an object model over this data that contains the two chosen identifiers embedded directly as attributes of the product. It is quite simple to write the SQL to perform this translation, but what is the overhead of maintaining two different object models over the same data?
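To make the translation concrete, here is a minimal sketch of the kind of SQL meant, run through Python's built-in sqlite3 module. Everything specific in it is an assumption for illustration: the table and column names, the view name `product_flat`, the choice of ISIN and SEDOL as the two identifiers, and the identifier values, which are made-up placeholders rather than real codes.

```python
import sqlite3

# Hypothetical tables, names and placeholder values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE identifier (
        product_id INTEGER REFERENCES product(id),
        type TEXT,
        value TEXT);

    INSERT INTO product VALUES (1, 'Example Corp ordinary shares');
    INSERT INTO identifier VALUES
        (1, 'ISIN', 'XX0000000001'),
        (1, 'SEDOL', '0000001'),
        (1, 'CUSIP', '000000001');

    -- The domain-specific view: the two chosen identifiers become plain
    -- columns on the product row. LEFT JOIN keeps products that lack one.
    CREATE VIEW product_flat AS
    SELECT p.id, p.name, i1.value AS isin, i2.value AS sedol
    FROM product p
    LEFT JOIN identifier i1
        ON i1.product_id = p.id AND i1.type = 'ISIN'
    LEFT JOIN identifier i2
        ON i2.product_id = p.id AND i2.type = 'SEDOL';
""")

flat_rows = conn.execute(
    "SELECT id, name, isin, sedol FROM product_flat").fetchall()
print(flat_rows)
```

The strategic tables stay untouched; the second shape of the data is one view definition rather than a second mapped object model.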
It can be hard to access vendor-specific database features through an object query language or Hibernate's criteria queries, and Hibernate/EJB risk having to play catch-up with the JDBC driver writers. From the developer's standpoint, it can sometimes feel like you are coaxing the tool into writing SQL that you know perfectly well how to write yourself. It is true that native SQL features bypass these issues, but there are limits to the complexity of queries that can be mapped. For example, (and correct me here if I'm wrong) you can't read five different objects and resolve all their references in a single query operation.
In general, you speak of things as they should be in an ideal world for a project that is being developed from scratch. There is, in your statement, a tacit assumption of communication between the data modeller and the object designer, which is reasonable for small to medium sized projects but may not be possible if the database designer belongs to a different part of the organization (as happens with strategic databases in large firms), or if the database was developed at a different time.
> Frankly, it sounds like you have never actually used
> a modern ORM solution.
In fact, I have written a modern ORM solution (Hydrate) which I think takes a pretty fresh approach to this problem that is worth considering. Hydrate is open source, btw. In a separate project I have made heavy use of Hibernate, and it is in this context that this post was made.
Thanks again for your comments. It looks like I misrepresented myself as an ORM luddite, which I'm clearly not; I'm just someone who thinks that there is always scope for improving already-great products. Take a look at Hydrate if you have a moment. I can't think of a better person to get feedback from.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
Well, to begin with, SQL looks more like a language family than a single language. The standard has statements like 'There shall be some functions to manipulate strings'. Hardly something that leads to a robust and portable solution.
And then what about using an outer join in a query. The various SQL implementations vary tremendously here. So, why not use SQL - because SQL isn't a language, it's a family of somewhat similar languages.
HQL/EJBQL are portable (far more than could be said for SQL) whilst not being that dissimilar to SQL.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> HQL/EJBQL are portable (far more than could be said
> for SQL) whilst not being that dissimilar to SQL.
HQL is portable between Hibernate and ...? EJBQL might be more portable, but the main reason is the very small number of products supporting EJBQL.
As a matter of fact, I am working with a large enterprise application that uses (a subset of) ANSI SQL, which makes the application portable between a number of different RDBMS vendors. It is not difficult to make a portable SQL application. For which vendors do you have problems with portability with joins?
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> From the developer's standpoint, it can sometimes feel like you are coaxing the tool into writing SQL that you know perfectly well how to write yourself.
This was one of many reasons we stopped using Hibernate, and I tend to agree with most of your other points as well. I have yet to meet a developer of business applications who does not know the level of SQL that these tools can generate. Sure, the syntax of OQL may be easier in some scenarios, but most people still know what is going to be generated.
Maybe I've just been unlucky, but all of the shielding that ORM tools provide tends to just get in the way of doing any real work. They tend to be much like app/code generators: if what you need fits precisely into their view of things, then they are great tools, but venture out just a little bit and the frustration factor rises quickly.
> In general, you speak of things as they should be in an ideal world for a project that is being developed from scratch. There is in your statement, a tacit assumption of communication between the data modeller and the object designer which is reasonable for small to medium sized projects, but may not be possible if the database designer is part of a different part of the organization (as happens with strategic databases in large firms), or the database was developed at a different time.
Boy does this ring true. Trying to get some DBAs to build the database so that your objects work cleanly is like pulling teeth. I can't recall the number of "discussions" I've had about having non-business primary keys, propagation of foreign keys and such topics to make our developers' jobs easier (or to make the tool's job easier). Toss in federated databases, vendor-provided dbs, stored procedures, views, etc. and most ORM solutions quickly lose their luster. I envy those that can use EJB3 or Hibernate or JDO for solving real business needs. Unfortunately even my simplest applications won't work with these kinds of tools.
I too have written a persistence tool based on my experience with EJB / Hibernate / JDO / OJB, etc. and will have to look at Hydrate to see what you have done.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
There is an important piece of the puzzle left out here. The reason you use HQL/EJB-QL is so you can query on objects, not on data. HQL/EJB-QL logically reside at the 'object' level, and if you have worked with them you'll recognize that this is inherently different from the underlying model you are working with. You are defining queries on the object/entity layer you have defined, not on the rows/columns in the database directly. All of the mapping as to what columns are required, and all of the work of converting simple object queries into their correspondingly complex SQL, is done for you.
When the other replier mentioned 'portable', I don't think the intention was to say that HQL itself was pluggable into different implementations, but that HQL acts as a translation layer, allowing you to harness the more complex features of a particular RDBMS without diluting your code with DB-specific SQL calls. I agree that ANSI subsets can be used to do quite a bit, but there are some features (such as complex joins, subselects, and so forth) that, if you can take advantage of them without tying yourself down, why not? The other advantage Hibernate brings here is that it downgrades gracefully: if it can't issue subselects, it doesn't fail, it just adapts the SQL it would output to account for that missing feature.
I'm not implying that ORMs are a panacea, but they can and do add benefits to many applications every day.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> There is an important piece of the puzzle left out
> here. The reason you use HQL/EJB-QL is so you can
> query on objects, not on data. HQL/EJB-QL logically
> reside at the 'object' level
This goes to the heart of the point about losing the power of the relational model. HQL/EJB-QL is not relational algebra; rather, it generates relational operations based on what it has been told about your object model. This is what I mean by losing the power of the relational database.
The cost shows up when you want to view your relational data in a way that is best represented by another object model, or you want to combine information from other databases. As I mentioned in a previous reply, this is why I think there is merit in the Hydrate tool for certain applications. More in line with the approach taken by iBatis, it maps any SQL, including vendor-specific dialects, into a chosen object model.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> The cost shows up when you want to view your
> relational data in a way that is best represented by
> another object model, or you want to combine
> information from other databases. As I mentioned in
> a previous reply, this is why I think there is merit
> in the Hydrate (http://hydrate.sourceforge.net)
> tool for certain applications. More in line with the
> approach taken by iBatis (http://ibatis.apache.org/),
> it maps any SQL including vendor-specific dialects
> into a chosen object model.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> The reason you use HQL/EJB-QL is so you can
> query on objects, not on data.
Since when is data not objects? Strings, integers, dates - they are all objects.
> HQL/EJB-QL logically
> reside at the 'object' level - and if you have worked
> with it you'll recognize that it is inherently
> different what underlying model you are working with.
Yes, I have worked this way for three years and in 95% of the scenarios the 'object' level and database level are identical. One table -> one class, one record -> one object. This is the basic rule in most enterprise applications.
> You are defining queries on the object/entity layer
> you have defined, not on the rows/columns in the
> database directly. All of that mapping as to what
> columns are required, and all of the work of
> converting simple object queries into their
> respectively complex SQL is done for you.
Do you have some examples where the HQL/EJBQL query is much simpler than the corresponding SQL query?
> there are some features (such as
> complex joins, subselects, and so forth) that if you
> can take advantage of them without tying yourself
> down, why not?
How good is the support for complex joins and subselects in HQL or EJBQL? You don't sacrifice very much if you skip using some of these SQL features in favour of RDBMS vendor portability.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> Since when is data not objects? Strings, integers,
> dates - they are all objects.
Well sure, but they are effectively primitives. Hibernate moves away from the standard database types, into your object's model - which is complex - which in this case I use to mean having complex associations with other objects. Hibernate understands that there are relationships between entities, and HQL has very expressive support for that.
> Yes, I have worked this way for three years and in
> 95% of the scenarios the 'object' level and database
> level are identical. One table -> one class, one
> record -> one object. This is the basic rule in most
> enterprise applications.
You have to admit this gets complicated when you need to load object associations (e.g. with joining).
> Do you have some examples where the HQL/EJBQL query
> is much simpler than the corresponding SQL query?
I'm so glad you asked. Here are a couple for starters (quoting from the Hibernate documentation)
from Cat cat where cat.mate.name is not null
This query translates to an SQL query with a table (inner) join. If you were to write something like
from Foo foo
where foo.bar.baz.customer.address.city is not null
you would end up with a query that would require four table joins in SQL.
These are just a couple of examples of where you are working with *complex* object relationships: associations between your objects beyond just the raw primitives.
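To see where the four joins come from, here is a toy sketch that expands a dotted property path into a join chain. It is not how Hibernate actually translates HQL. The function name `path_to_sql` and the naming convention (each association `x.y` stored as a column `yid` on table `x`, pointing at `y.id`) are assumptions made purely for illustration.

```python
def path_to_sql(path):
    """Expand a dotted property path into a chain of SQL inner joins.

    Assumed convention (hypothetical): the association `x.y` is stored as
    column `yid` on table `x`, referencing `y.id`. The last path element
    is treated as a plain column on the final table.
    """
    *tables, column = path.split(".")
    lines = ["select {}.*".format(tables[0]), "from {}".format(tables[0])]
    # One join per hop between consecutive tables in the path.
    for parent, child in zip(tables, tables[1:]):
        lines.append(
            "join {c} on {p}.{c}id = {c}.id".format(p=parent, c=child))
    lines.append("where {}.{} is not null".format(tables[-1], column))
    return "\n".join(lines)


sql = path_to_sql("foo.bar.baz.customer.address.city")
print(sql)
```

The path names five entities (foo, bar, baz, customer, address) and one column (city), so the expansion needs one join per hop between entities, four in total.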
> How good is the support for complex joins and
> subselects in HQL or EJBQL? You don't sacrifice very
> much if you skip using some of these SQL features in
> favour of RDBMS vendor portability.
Hibernate has very extensive support for complex joins and subselects performed manually (see the Joining and Sub-queries sections of the documentation), even going beyond the natural support in SQL by being able to compose multiple selects to meet your needs (though in most cases it can compose a single, very complex SQL statement). The key thing to remember is that a lot of the time HQL can perform joins and subselects for you, without you having to concern yourself with the fact that it is joining. Joining is just one way that Hibernate can populate object associations: it can also populate those associations lazily using optimized subselects and so forth. Joining is covered well in the core Hibernate documentation; I have covered the subselects here at Javalobby:
Hibernate: Understanding Lazy Fetching
Hibernate: Tuning Lazy Fetching
It's hard to quantify how much you skip if you sacrifice those joining features to have RDBMS vendor portability, so I can't argue concretely with you there, however I will say that in my experience being able to use them has been a good thing.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> Well sure, but they are effectively primitives.
> Hibernate moves away from the standard database
> types, into your object's model - which is complex -
> which in this case I use to mean having complex
> associations with other objects. Hibernate
> understands that there are relationships between
> entities, and HQL has very expressive support for
> that.
Tables, columns, records and foreign keys can be classes/objects too. An RDBMS understands relationships too.
> You have to admit this gets complicated when you need
> to load object associations (e.g. with joining).
Why?
>
>
> from Cat cat where cat.mate.name is not null
>
> This query translates to an SQL query with a table
> (inner) join. If you were to write something like
>
>
> from Foo foo
> where foo.bar.baz.customer.address.city is not null
>
>
> you would end up with a query that would require four
> table joins in SQL.
>
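-- cat.mate is itself a cat, so this is a self-join on the cat table
select cat.*
from cat
join cat mate on mate.id = cat.mateid
where mate.name is not null;

-- one join per association in foo.bar.baz.customer.address
select foo.*
from foo
join bar on foo.barid = bar.id
join baz on bar.bazid = baz.id
join customer on baz.customerid = customer.id
join address on customer.addressid = address.id
where address.city is not null;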
I don't think there is any significant difference here. But I agree that SQL should be extended so that the join criteria may be skipped if you are joining two tables that only have one relation between them.
> It's hard to quantify how much you skip if you
> sacrifice those joining features to have RDBMS vendor
> portability, so I can't argue concretely with you
> there, however I will say that in my experience being
> able to use them has been a good thing.
Actually I have never encountered a join or subselect scenario that is not compatible between the major RDBMS vendors. Maybe you can give some examples of when this is a problem?
Has the Objects/Relational Impedance Mismatch been Solved?
At 3:19 PM on Nov 20, 2005, David Chamberlin wrote:
Fresh Jobs for Developers Post a job opportunity
At last with JDO and EJB, with Hibernate and TopLink, with Kodo and OJB, there is no longer any need to write SQL, or even use a relational database. So where's the catch?
Relational databases originally won out over hierarchical or file-based databases not because they were more performant, more efficient or had better support, but because of the qualities of the relational model, which has since the early days geen accessed through the (arguably broken) Structured Query Language (SQL). SQL is a victim of its own history and success. It is overly verbose. It lacks symmetry. It struggles to be effective without resorting to extensions to the agreed standard. It's in a different league compared to Java and C#, for example. These object oriented languages are so much more clean and exact in the way they represent and manipulate data as well as being infinitely more flexible and frankly, let's face it, more beautiful.
The fact is that this beauty hides a mess of ugliness beneath, whereas the ugliness of SQL hides a quiet beauty that is not even noticed by most casual writers of select statements.
So what is object-oriented programming's dark secret? Simply put, the strength of OO derives from the fact that it generates structures in programming languages that are akin to the way human minds happen to process information. We love to classify things, put boxes around them, create abstractions where a complex concept can be treated in a way that doesn't need to include the details. We use these structures very effectively: we can classify all of the organisms on earth into Kingdoms, Families and Species: one huge inheritance hierarchy of evolution. We can simultaneously understand the nature of an aeroplane as being like a bus (except that it flies) or like a bird (except that it's man-made, and bigger), both being different abstract ways of understanding the same object. And object oriented programming helps us all the way: for every abstraction we can think of, there is an interface. For every 'thing' that exists in our simulated world, there is an object class. This is 'common sense injection' and we can use it to structure our very code, so that things that don't make sense can't even be expressed in our language.
There is a problem however, with common sense injection, and it is this. This approach to modelling systems is only one possible way of doing things, and it doesn't work very well in many cases. Einstein did not develop special relativity by thinking about space and time using his common sense; rather, he took a simple observation (that light needs no frame of reference to limit its speed) and then used mathematics to derive the unavoidable, but distinctly non-common sense conclusions. Neither did Keynes come up with his ideas about economics by thinking in accepted common sense terms. Sometimes our common sense does not permit us to reach a deeper understanding of an underlying model. The reason for us organizing our thoughts in the way we do has troubled philosophers since Plato, but the fact is that, despite two and a half millennia of effort, nobody has come up with a convincing explanation. There is nothing intrinsically correct about organizing the world in this way, and modelling programs as such is only useful because that's how most people's minds work. Formal, correct and complete as it may seem, object oriented programming has its basis in arbitrary human behaviour.
SQL is different. Behind SQL, and somewhat obscured by its clunky syntax, sits a body of mathematical theory called relational calculus which, like the rules that tell us what to do when adding, subtracting or multiplying two numbers, provides us with a set of operations we can apply to 'relations', which equate roughly to tables in a relational database. SQL as a language evolved as a de facto standard in the era of COBOL and has been slow to modernize its syntax and clean up its idiosyncrasies because, to be honest, it isn't in the commercial interests of any of the database vendors to do it. There has been agonizingly slow progress with the ANSI process and now at least some of the features introduced by vendors in the early 90's are starting to become available through this standard - at the same time as the vendors are further extending their databases' capabilities. Of course there have long been calls for a proper symbolic relational calculus as an agreed standard to access relational databases, but that's another story.
The bottom line is that the relational model is a formal mathematical representation for data. Mathematical abstractions are useful because they help us to solve problems that common sense finds difficult to access.
Alice starts north from Abbotsville going at 50 mph and Bob starts from Boxford at 40 mph at the same moment. Boxford is 20 miles directly north of Abbotsville. Ignoring relativistic effects, and assuming both travel in a straight line north, how far from Boxford are they when Alice passes Bob?
Go on, solve it, you know you want to. The fact is, most people solve this kind of problem by writing down an equation and solving the equation. While solving the mathematics, they are not thinking about what the variables in the equation represent; they are going through a logical process to reach an answer that is difficult to access using a common sense approach (and even harder if you do take relativity into account).
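For readers who want to check their answer, the equation can be written out as follows. The symbol t (hours since both set off) and the choice to measure positions in miles north of Abbotsville are introduced here for illustration only.

```latex
\begin{align*}
50t &= 20 + 40t && \text{(Alice's position equals Bob's position)} \\
10t &= 20 \\
t &= 2 \text{ hours} \\
\text{distance from Boxford} &= 40t = 80 \text{ miles}
\end{align*}
```

Nothing in the middle two steps depends on what t stands for, which is the point the paragraph above is making.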
Relational databases are a formal mathematical representation that is useful when solving data problems. It so happens that SQL is the 'standard' for accessing the relational algebra needed to manipulate this relational data. This highlights the true impedance mismatch between object oriented programming and relational databases. The former is an arbitrary, but common sense model of the world whereas the latter is a formal mathematical abstraction of the data. The problem is not helped by the idiosyncrasies and dialects of SQL, but this fact actually hides the true nature of what is required to reconcile these two views.
So much yada yada, so where are the examples? The problem with coming up with approaches to dealing with complex systems, is that the simple examples don't really shed much light on the problem. However, let's give it a go. Suppose we are building a stock tracking system. We have stock items that have a stock index, a colour, a shape and a price. We have clients in the database with a name, address and telephone, and purchase records that record which clients bought which stock items for which price.
The system is pure common sense: there are 'person' things representing clients, 'stockitem' things representing lines of stock, and 'purchase' items representing a transaction. You have implemented this system in an object oriented language, using an object oriented, XML, or other non-relational database. Everything has been working well for years, and your little company has gone from strength to strength, raking in the profits. So well, in fact, that you decide to buy out another company and leverage your loyal client base. You now have a new requirement for your little database for doing cross selling:
What has happened here is that the common sense injection in your original application has let you down. This application had a single purpose: keeping stock records and recording client addresses and payments. Now you want to use it for data mining: a purpose for which it wasn't designed, and you end up having to do a major rewrite to have it answer these simple questions. Had the application been created with a relational database, these questions would be very simple to answer, though it might be difficult, particularly with a real-world problem, to map the answers into your object model using an Object/Relational mapping layer that does not give you full access to the underlying SQL.
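To make the relational side of this concrete, here is a minimal sketch using Python's standard sqlite3 module. The cross-selling questions themselves are not reproduced in the text above, so the question answered here ("which clients bought a red item but never a round one?") is a hypothetical stand-in, and the table names, column names and sample rows are all assumptions based on the description of the stock tracking system.

```python
import sqlite3


def build_db():
    # In-memory database with an assumed schema and made-up sample data.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE client (
            id INTEGER PRIMARY KEY, name TEXT, address TEXT, telephone TEXT);
        CREATE TABLE stockitem (
            id INTEGER PRIMARY KEY, stock_index TEXT,
            colour TEXT, shape TEXT, price REAL);
        CREATE TABLE purchase (
            client_id INTEGER REFERENCES client(id),
            stockitem_id INTEGER REFERENCES stockitem(id),
            price_paid REAL);
        INSERT INTO client VALUES
            (1, 'Ann', '1 High St', '555-0001'),
            (2, 'Bob', '2 Low Rd', '555-0002');
        INSERT INTO stockitem VALUES
            (10, 'A-10', 'red', 'round', 5.0),
            (11, 'B-11', 'red', 'square', 7.5),
            (12, 'C-12', 'blue', 'round', 3.0);
        INSERT INTO purchase VALUES (1, 10, 5.0), (1, 12, 3.0), (2, 11, 7.0);
    """)
    return conn


def red_but_never_round(conn):
    # Hypothetical ad hoc question that the original application
    # was never designed to answer.
    rows = conn.execute("""
        SELECT DISTINCT c.name
        FROM client c
        JOIN purchase p ON p.client_id = c.id
        JOIN stockitem s ON s.id = p.stockitem_id
        WHERE s.colour = 'red'
          AND c.id NOT IN (
              SELECT p2.client_id
              FROM purchase p2
              JOIN stockitem s2 ON s2.id = p2.stockitem_id
              WHERE s2.shape = 'round')
        ORDER BY c.name
    """).fetchall()
    return [name for (name,) in rows]
```

Calling `red_but_never_round(build_db())` answers the question without any change to the schema or to the code that records purchases; a new question only needs a new query.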
So moving back to the original question: has the impedance problem been solved? The risk is that many solutions that map object data to relational databases and vice versa sacrifice the power of the relational model in doing so. By eliminating SQL, or offering a solution that is agnostic as to whether it is using a relational, object or XML database under the covers, you lose the flexibility to change your mind, in the future, about what you want your database to do. And that's got to be a bad thing.
21 replies so far
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> So moving back to the original question: has the
> impedance problem been solved? The risk is that many
> solutions that map object data to relational
> databases and vice versa sacrifice the power of the
> relational model in doing so. By eliminating SQL, or
> offering a solution that is agnostic as to whether it
> is using a relational, object or XML database under
> the covers, you lose the flexibility to change your
> mind, in the future, about what you want your
> database to do. And that's got to be a bad thing.
I don't think I agree with your statement that solutions that map object data to relational databases sacrifice the power of the relational model. It seems to me that we get the best of both worlds. We can write code that uses objects (which makes for code that is easier to understand and easier to maintain). And we can fall back to SQL for reporting, answering questions, and data mining. Not sure I see a problem here. Except that I don't like SQL and wish there were a better, more widely supported alternative for using relational calculus.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> These object oriented languages are so much more clean and
> exact in the way they represent and manipulate data as well
> as being infinitely more flexible and frankly, let's face
> it, more beautiful.
Huh!? Really? This is a very, very bold claim to make without offering any evidence or argument for it. Show some examples of how an OO language offers cleaner or more flexible data manipulation.
> Had the application been created with a relational
> database, these questions would be very simple to
> answer, though it might be difficult, particularly with
> a real-world problem, to map the answers into your
> object model using an Object/Relational mapping layer
> that does not give you full access to the underlying SQL.
This is just not true. There is no problem expressing these queries in an OO query language like HQL / EJB-QL.
> So moving back to the original question: has the
> impedance problem been solved? The risk is that many
> solutions that map object data to relational databases
> and vice versa sacrifice the power of the relational
> model in doing so. By eliminating SQL, or offering a
> solution that is agnostic as to whether it is using a
> relational, object or XML database under the covers, you
> lose the flexibility to change your mind, in the future,
> about what you want your database to do. And that's got
> to be a bad thing.
None of this is true. The whole point of ORM is to allow you to have your cake and eat it. Your data modeller can create a correct relational model. You can map that to a convenient object model. An OO query language like HQL/EJB-QL is so similar to SQL that you lose nothing. Solutions like EJB3 persistence and Hibernate are NOT data model agnostic; rather, they are targeted directly at relational databases.
Frankly, it sounds like you have never actually used a modern ORM solution.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
Hello David
Obviously there are situations where O/R mapping tools are not suited, and where direct manual database manipulation is required.
But in most cases O/R mapping tools are able to deal with the specifics of the underlying data sources, and they now provide a lot of tuning options.
Most of your concerns relate to the first generation of O/R mapping tools, in the 90s. Most modern products have dropped such limitations. Besides this, modern standards like JDO 2 and EJB 3 support native SQL queries and a mix of O/R mapping and direct JDBC. Maybe you haven't used an O/R mapping product in a long time, or you don't know them very well in terms of internal design.
Best Regards, Eric.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> An OO query language like HQL/EJB-QL is so similar to SQL
> that you lose nothing.
If HQL/EJB-QL are so similar, why invent a new query language? Why not use the existing standard query language?
Fredrik Bertilsson
http://butler.sourceforge.net
Re: Has the Objects/Relational Impedance Mismatch been Solved?
Gavin,
Thanks for your feedback, I agree with much but not all.
> > These object oriented languages are so much more clean and
> > exact in the way they represent and manipulate data as well
> > as being infinitely more flexible and frankly, let's face
> > it, more beautiful.
>
> Huh!? Really? This is a very, very bold claim to make
> without offering any evidence or argument for it.
> > Show some examples of how an OO language offers
> > cleaner or more flexible data manipulation.
>
This was intended as an ironic jab at anyone who might deny that relational is a useful model.
> > Had the application been created with a relational
> > database, these questions would be very simple to
> > answer, though it might be difficult, particularly with
> > a real-world problem, to map the answers into your
> > object model using an Object/Relational mapping layer
> > that does not give you full access to the underlying SQL.
>
> > This is just not true. There is no problem expressing
> these queries in an OO query language like HQL /
> EJB-QL.
>
The illustration was really to highlight the shortcomings of the pure object-oriented database approach. HQL / EJB-QL work well for this simple case.
> > So moving back to the original question: has the
> > impedance problem been solved? The risk is that many
> > solutions that map object data to relational databases
> > and vice versa sacrifice the power of the relational
> > model in doing so. By eliminating SQL, or offering a
> > solution that is agnostic as to whether it is using a
> > relational, object or XML database under the covers, you
> > lose the flexibility to change your mind, in the future,
> > about what you want your database to do. And that's got
> > to be a bad thing.
>
> None of this is true. The whole point of ORM is to
> allow you to have your cake and eat it. Your data
> modeller can create a correct relational model. You
> can map that to a convenient object model. An OO
> query language like HQL/EJB-QL is so similar to SQL
> that you lose nothing. Solutions like EJB3
> persistence and Hibernate are NOT data model
> > agnostic, rather they are targeted directly at
> relational databases.
>
I would have to disagree with this last statement for a number of reasons.
For example, consider a database that stores information about financial products. There are tens of different identifiers that are used to reference financial products (CUSIP, SEDOL, ISIN, RIC, Ticker, etc.), some of which are better for different markets or different instrument types. A strategic database employs a normalized form of this data where a 'Product' and 'Identifiers' are stored in separate tables, and each identifier has a reference to the product, a type and a value.
Meanwhile, a domain-specific user of this data might be operating in a market where she only cares about two possible identifiers and for convenience wants to work with an object model over this data that contains the two chosen identifiers embedded directly as attributes of the product. It is quite simple to write the SQL to perform this translation, but what is the overhead of maintaining two different object models over the same data?
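As a rough illustration of the translation described above, the sketch below pivots two identifier types into attributes of a product object, using Python's standard sqlite3 module. The table names (product, identifier), the column names, the MarketProduct class and the choice of ISIN and Ticker as the two identifiers are all assumptions for the example, and the identifier values are placeholders rather than real codes.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarketProduct:
    # Hypothetical domain-specific view: two identifiers as plain attributes.
    name: str
    isin: Optional[str]
    ticker: Optional[str]


def build_db():
    # Assumed layout: one row per (product, identifier type, value).
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE identifier (
            product_id INTEGER REFERENCES product(id),
            type TEXT,
            value TEXT);
        INSERT INTO product VALUES (1, 'Acme Corp'), (2, 'Bolt Ltd');
        INSERT INTO identifier VALUES
            (1, 'ISIN', 'ISIN-ACME'),
            (1, 'Ticker', 'ACME'),
            (1, 'CUSIP', 'CUSIP-ACME'),
            (2, 'SEDOL', 'SEDOL-BOLT');
    """)
    return conn


def load_products(conn):
    # Conditional aggregation turns identifier rows into columns;
    # a product lacking one of the two identifiers gets None for it.
    rows = conn.execute("""
        SELECT p.name,
               MAX(CASE WHEN i.type = 'ISIN' THEN i.value END) AS isin,
               MAX(CASE WHEN i.type = 'Ticker' THEN i.value END) AS ticker
        FROM product p
        LEFT JOIN identifier i ON i.product_id = p.id
        GROUP BY p.id, p.name
        ORDER BY p.name
    """).fetchall()
    return [MarketProduct(*row) for row in rows]
```

The query is short, but note that the MarketProduct shape lives only in this one query. A second user who wants, say, CUSIP and SEDOL needs another query and another class over the same two tables, which is the maintenance question raised above.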
> Frankly, it sounds like you have never actually used
> a modern ORM solution.
In fact, I have written a modern ORM solution (Hydrate) which I think takes a pretty fresh approach to this problem that is worth considering. Hydrate is open source, btw. In a separate project I have made strong use of Hibernate and it is in this context that this post was made.
Thanks again for your comments - looks like I misrepresented myself as an ORM Luddite, which I'm clearly not, just someone who thinks that there is always scope for improving already-great products. Take a look at Hydrate if you have a moment - I can't think of a better person to get feedback from.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
Well, to begin with, SQL looks more like a language family than a single language. The standard has statements like 'There shall be some functions to manipulate strings.' Hardly something that leads to a robust and portable solution.
And then what about using an outer join in a query? The various SQL implementations vary tremendously here. So, why not use SQL? Because SQL isn't a language, it's a family of somewhat similar languages.
HQL/EJBQL are portable (far more than could be said for SQL) whilst not being that dissimilar to SQL.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> HQL/EJBQL are portable (far more than could be said
> for SQL) whilst not being that dissimilar to SQL.
HQL is portable between Hibernate and ...? EJBQL might be more portable, but the main reason is the very small number of products supporting EJBQL.
As a matter of fact, I am working with a large enterprise application that uses (a subset of) ANSI SQL, which makes the application portable between a number of different RDBMS vendors. It is not difficult to make a portable SQL application. For which vendors do you have problems with portability with joins?
Fredrik Bertilsson
http://butler.sourceforge.net
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> From the developer's standpoint, it can sometimes feel like you are coaxing the tool into writing SQL that you know perfectly well how to write yourself.
This was one of many reasons we stopped using Hibernate, and I tend to agree with most of your other points as well. I have yet to meet a developer of business applications who does not know the level of SQL that these tools can generate. Sure, the syntax of OQL may be easier in some scenarios, but most people still know what is going to be generated.
Maybe I've just been unlucky, but all of the shielding that ORM tools provide tends to just get in the way of doing any real work. They tend to be much like app/code generators; if what you need fits precisely into their view of things, then they are great tools, but venture out just a little bit and the frustration factor rises quickly.
> In general, you speak of things as they should be in an ideal world for a project that is being developed from scratch. There is in your statement, a tacit assumption of communication between the data modeller and the object designer which is reasonable for small to medium sized projects, but may not be possible if the database designer is part of a different part of the organization (as happens with strategic databases in large firms), or the database was developed at a different time.
Boy does this ring true. Getting some DBAs to build the database so that your objects work cleanly is like pulling teeth. I can't recall the number of "discussions" I've had about having non-business primary keys, propagation of foreign keys and such topics to make our developers' jobs easier (or to make the tool's job easier). Toss in federated databases, vendor provided dbs, stored procedures, views, etc. and most ORM solutions quickly lose their luster. I envy those that can use EJB3 or Hibernate or JDO for solving real business needs. Unfortunately my simplest applications won't work with these kinds of tools.
I too have written a persistence tool based on my experience with EJB / Hibernate / JDO / OJB, etc. and will have to look at Hydrate to see what you have done.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
There is an important piece of the puzzle left out here. The reason you use HQL/EJB-QL is so you can query on objects, not on data. HQL/EJB-QL logically reside at the 'object' level - and if you have worked with it you'll recognize that it is inherently different from the underlying model you are working with. You are defining queries on the object/entity layer you have defined, not on the rows/columns in the database directly. All of that mapping as to what columns are required, and all of the work of converting simple object queries into their respectively complex SQL is done for you.
When the other replier mentioned 'portable', I don't think the intention was to say that HQL itself was pluggable to different implementations, but that HQL acted as a translation layer allowing you to harness the more complex features of a particular RDBMS without diluting your code with DB-specific SQL calls. I agree that ANSI subsets can be used to do quite a bit - but, there are some features (such as complex joins, subselects, and so forth) that if you can take advantage of them without tying yourself down, why not? The other advantage Hibernate brings here is that it downgrades gracefully - if it can't issue subselects, it doesn't fail, it just adapts the SQL it would output to account for that missing feature.
I'm not implying that ORMs are a panacea, but they can and do add benefits to many applications every day.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> There is an important piece of the puzzle left out
> here. The reason you use HQL/EJB-QL is so you can
> query on objects, not on data. HQL/EJB-QL logically
> reside at the 'object' level
This goes to the heart of the point about losing the power of the relational model. HQL/EJB-QL is not relational algebra; rather, it generates relational operations based on what it has been told about your object model. This is what I mean by losing the power of the relational database.
The cost shows up when you want to view your relational data in a way that is best represented by another object model, or you want to combine information from other databases. As I mentioned in a previous reply, this is why I think there is merit in the Hydrate tool for certain applications. More in line with the approach taken by iBatis, it maps any SQL including vendor-specific dialects into a chosen object model.
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> The cost shows up when you want to view your
> relational data in a way that is best represented by
> another object model, or you want to combine
> information from other databases. As I mentioned in
> a previous reply, this is why I think there is merit
> in the Hydrate (http://hydrate.sourceforge.net)
> tool for certain applications. More in line with the
> approach taken by iBatis (http://ibatis.apache.org/), it maps
> any SQL including vendor-specific dialects into a
> chosen object model.
Nothing prevents you from getting Map objects out of HQL queries, or view objects, or a plain Object[] (see http://www.hibernate.org/hib_docs/v3/reference/en/html/queryhql.html#queryhql-select).
What do you lack?
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> The reason you use HQL/EJB-QL is so you can
> query on objects, not on data.
Since when is data not objects? Strings, integers, dates - they are all objects.
> HQL/EJB-QL logically
> reside at the 'object' level - and if you have worked
> with it you'll recognize that it is inherently
> different from the underlying model you are working with.
Yes, I have worked this way for three years and in 95% of the scenarios the 'object' level and database level are identical. One table -> one class, one record -> one object. This is the basic rule in most enterprise applications.
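The "one table -> one class, one record -> one object" rule can be sketched in a few lines of Python using the standard sqlite3 and dataclasses modules. The Cat class, its columns and the fetch_all helper are hypothetical names chosen for the example; this is not how any particular ORM or persistence tool is implemented.

```python
import sqlite3
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Cat:
    # One class per table: field names mirror the (assumed) column names.
    id: int
    name: str
    mateid: Optional[int]


def fetch_all(conn, cls):
    # The table name and column list are derived from the class itself,
    # so each record maps to exactly one object.
    cols = [f.name for f in fields(cls)]
    sql = f"SELECT {', '.join(cols)} FROM {cls.__name__.lower()} ORDER BY {cols[0]}"
    return [cls(*row) for row in conn.execute(sql)]


def demo():
    # Made-up sample data in an in-memory database.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cat (id INTEGER PRIMARY KEY, name TEXT, mateid INTEGER);
        INSERT INTO cat VALUES (1, 'Tom', 2), (2, 'Kit', NULL);
    """)
    return fetch_all(conn, Cat)
```

When the mapping is this direct, the object-level query and the SQL carry the same information, which is the situation described in the paragraph above. The sketch says nothing about the remaining cases where the two levels diverge.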
> You are defining queries on the object/entity layer
> you have defined, not on the rows/columns in the
> database directly. All of that mapping as to what
> columns are required, and all of the work of
> converting simple object queries into their
> respectively complex SQL is done for you.
Do you have some examples where the HQL/EJBQL query is much simpler than the corresponding SQL query?
> there are some features (such as
> complex joins, subselects, and so forth) that if you
> can take advantage of them without tying yourself
> down, why not?
How good is the support for complex joins and subselects in HQL or EJBQL? You don't sacrifice very much if you skip using some of these SQL features in favour of RDBMS vendor portability.
Fredrik Bertilsson
http://butler.sourceforge.net
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> Since when is data not objects? Strings, integers,
> dates - they are all objects.
Well sure, but they are effectively primitives. Hibernate moves away from the standard database types, into your object's model - which is complex - which in this case I use to mean having complex associations with other objects. Hibernate understands that there are relationships between entities, and HQL has very expressive support for that.
> Yes, I have worked this way for three years and in
> 95% of the scenarios the 'object' level and database
> level are identical. One table -> one class, one
> record -> one object. This is the basic rule in most
> enterprise applications.
You have to admit this gets complicated when you need to load object associations (e.g. with joining).
> Do you have some examples where the HQL/EJBQL query
> is much simpler than the corresponding SQL query?
I'm so glad you asked. Here are a couple for starters (quoting from the Hibernate documentation)
These are just a couple of examples of where you are working with *complex* object relationships - associations between your objects beyond just the raw primitives.
> How good are the support for complex joins and
> subselects in HQL or EJBQL? You don't sacrifice very
> much if you skip using some of these SQL features in
> favour of RDBMS vendor portability.
Hibernate has very extensive support for complex joins and subselects to be performed manually (see the documentation: Joining, and Sub-queries), even going beyond the natural support in SQL by being able to compose multiple selects to meet your needs (in most cases it can compose very complex single SQL statements, however). The key thing to remember is that a lot of the time HQL can perform joins and subselects for you, without you having to concern yourself with the fact that it is joining. Joining is just one way that Hibernate can populate object associations - it can also populate those associations lazily using optimized subselects and so forth. The joining is covered well in the core Hibernate documentation; I have covered the subselects here at Javalobby:
Hibernate: Understanding Lazy Fetching
Hibernate: Tuning Lazy Fetching
It's hard to quantify how much you skip if you sacrifice those joining features to have RDBMS vendor portability, so I can't argue concretely with you there, however I will say that in my experience being able to use them has been a good thing.
Regards,
Re: Has the Objects/Relational Impedance Mismatch been Solved?
> Well sure, but they are effectively primitives.
> Hibernate moves away from the standard database
> types, into your object's model - which is complex -
> which in this case I use to mean having complex
> associations with other objects. Hibernate
> understands that there are relationships between
> entities, and HQL has very expressive support for
> that.
Tables, columns, records and foreign keys can be classes/objects too. An RDBMS understands relationships too.
> You have to admit this gets complicated when you need
> to load object associations (e.g. with joining).
Why?
>
I don't think there is any significant difference here. But I agree that SQL should be extended so the join criteria may be skipped if you are joining two tables that have only one relation between them.
> It's hard to quantify how much you skip if you
> sacrifice those joining features to have RDBMS vendor
> portability, so I can't argue concretely with you
> there, however I will say that in my experience being
> able to use them has been a good thing.
Actually, I have never encountered a join or subselect scenario that is not compatible between the major RDBMS vendors. Maybe you can give some examples of when this is a problem?
Fredrik Bertilsson
http://butler.sourceforge.net