points by terhechte 11 years ago

I used to write raw SQL for many years, then, around 2005 switched over to ORMs in order to be able to target different databases, have a nice model, etc. Lets be honest here, the ease of justing doing:

    p.username = "Carl"
    p.age = 33
    p.save

instead of "update users set username=:username, age=:age where id=:id" has a ton of advantages. For one, some sort of syntax or type checker is actually trying to understand your queries and makes it easy to find typos before the database laments in the middle of a huge transaction. Strongly typed languages are even cooler here (most notably Slick for Scala, which has a fully type-checked DSL for database querying which makes it really difficult to create typos) [1].

However, the downside of this is that it only works as long as whatever you need from your database is the lowest common denominator of database features. I'm currently working on a Clojure + Postgres project and I'm using all the bells & whistles that Postgres has to offer: HStore types, Json Types, subqueries in subqueries, Upserts, functions, etc. I have a rather complex database that I'm running analytical queries against, and using an ORM for this would simply not work. I'd spend half my time trying to figure out how to implement a certain Postgres feature in -insert-or-name-here. I actually started with Korma [2], a simple Clojure ORM, and gave up because it was too tedious trying to figure out how to get it to correctly run my 4-function column result query.

I've since switched to YeSQL [3] which follows a very interesting idea: You write queries in .sql files (full editor support, jay) and tag every query with a name. YeSQL then reads these .sql files during compile time and dynamically generates clojure functions with the correct amount of parameters based on these queries. It is kinda the best of both worlds. Example:

queries.sql:

    -- name: get-users
    -- a function to retrieve the user by name
    select * from users where username=:name

clojure.clj:

    (defqueries "some/where/queries.sql")

    ;; now I can do:
    (get-users "carl")

I think it really depends on the use case. If the database model is simple and there're no crazy database technologies in use, I'd rather go and use an ORM again I guess.

[1] http://slick.typesafe.com/ [2] http://sqlkorma.com/ [3] https://github.com/krisajenkins/yesql

thom 11 years ago

YeSQL seems a little bit like it's reinventing the Microsoft data access ecosystem of 10-15 years ago - stored procedures behind a code-generated API. Retro is cool.

The Clojure SQL ecosystem is weird. clojureql seemed wonderful for a while, but has been left to rot and the fact that nobody's really picked it up implies people have just moved on with their lives. Korma and YeSQL seem to handle most of the Rails-like use cases, and I guess everyone else has moved to more esoteric datastores.

  • davedx 11 years ago

    Except stored procedures have to be stored in the database AND usually end up in source control too. At least with this library there is one source for your queries. Also, they are compiled into your language as first class functions, which I like too... kind of like how JSON is a first class citizen in JavaScript.

    I really like the approach.

  • casperc 11 years ago

    I think alot of people (including myself and my team) just tend to get on with their lives and use clojure/java.jdbc. Sure it's not that sexy, but it gets the job done.

    And for the 80-90% queries you can create a small function or namespace with a nice API and converts to the types you like.

Silhouette 11 years ago

However, the downside of this is that it only works as long as whatever you need from your database is the lowest common denominator of database features.

This can be an overriding practical objection to every ORM I've encountered so far.

To give an example I've run into several times, Postgres offers several levels of transaction isolation. The more isolated levels offer stronger guarantees, but you also need to be able to recover and retry if a transactions fails a serialization condition the first time. These levels determine how interactions work with related tools like explicit locking, SELECT FOR UPDATE, and so on.

In least common denominator ORM world, you're lucky if you get any serious control over this kind of thing at all. If you actually have a use case that requires precision here -- and sometimes you do even in surprisingly simple use cases, such as needing to allocate new IDs in an increasing, guaranteed contiguous sequence in Postgres -- then this stuff matters.

Writing longhand SQL queries is a pain for several obvious reasons, but it's still better than using an ORM and finding it doesn't support a feature you need or, worse, it does things implicitly and sometimes gets them wrong.

dharbin 11 years ago

I'm loving all the momentum towards writing templated-sql, in fact, I wrote a library for this myself[1].

By leveraging jinja2/django-style template inheritance, you can even bring some advantages of ORMs (composition, reuse, and extending) into the raw-sql world.

The OP also intimated that he's taking a templated approach:

"“In these cases, I've elected to write queries using a templating system and describe the tables using the ORM. I get the convenience of an application level description of the table with direct use of SQL. It's a lot less trouble than anything else I've used so far.”

[1] https://github.com/civitaslearning/swigql

  • jdotjdot 11 years ago

    If you're at all interested in opening a kick starter for such a templating library for Django, I'd back it. I have attribute creep all the time and actually generally prefer raw SQL with the exception of its verbosity. The problem is, migrations are awful and SQL injection mistakes easy to come by. Would be great to have the best of both worlds in a SQL templating engine + sort-of ORM wrapper that auto-generates via SQL inspection

    • kyllo 11 years ago

      Without the ORM, though, what would be the point of using Django? To me it seems like this would be a better fit for a more minimalist platform like Flask.

    • crazytony 11 years ago

      not sure I follow? Most sql templates I have seen behave like Django's .raw() function (which inspects the resultset and auto binds to the object)

spacemanmatt 11 years ago

I'm slightly envious that you are working on a Clojure project with PostegrSQL, especially involving all the bells and whistles. Got any PostGIS or otherwise geospatial data, on top of all that? :)

  • terhechte 11 years ago

    Well, it is my own project, so I got to choose the technologies :) It is a lot of fun working on it. The data does have location information, but I'm not sure if I'll use PostGIS for it, as it would be a bit of taking a sledgehammer to crack a nut. The location information is rather sparse.

    • baudehlo 11 years ago

      Most people who think they need postgis just need earthdistance. Try it.

Fishkins 11 years ago

Thanks for sharing your experience. I've been meaning to try Slick, and YeSQL sounds like a nice way to reduce some boilerplate with no real downside. I go back and forth about how I feel about ORMs. I think everyone can agree you'll need to learn SQL for any non-trivial project, even if you end up using some abstraction on top of it.

On a tangent: you mentioned Upserts in Postgres features. I thought Postgres didn't have any kind of Upsert. Was it added recently or something?

  • Xorlev 11 years ago

    Postgres still doesn't. I usually write a rule to do so. For instance, here's my "ON DUPLICATE KEY UPDATE eid=eid":

      CREATE RULE location_updates_on_duplicate_ignore AS
        ON INSERT TO location_updates
        WHERE (EXISTS (SELECT 1 FROM location_updates WHERE (eid = NEW.eid))) 
        DO INSTEAD NOTHING;
    

    But you can also do much more advanced merge logic by replacing DO INSTEAD NOTHING with DO INSTEAD <statement>. Mine just needed to ignore already-submitted batches.

  • terhechte 11 years ago

    Postgres indeed doesn't have Upsert yet, so I'm going the default way of locking the table, and implementing it via a slightly more complex query. I was just too lazy to explain that in my earlier comment. The problem is the same: The syntax below can't really be represented well in a ORM.

        BEGIN;
        LOCK TABLE search_tracking IN SHARE ROW EXCLUSIVE MODE;
        WITH upsert AS (UPDATE search_tracking SET count=count+1 WHERE keyword = 'John Doe' RETURNING *) INSERT INTO search_tracking (keyword, count) SELECT 'John Doe', 1 WHERE NOT EXISTS(SELECT * FROM upsert);
        COMMIT;
  • ssmoot 11 years ago

    Side Note: Slick has basic parameterized queries/statements/DDL.

    eg:

      val cachedPurchasesQuery = Q[String, (Int, AccountId, Timestamp)] + """
                                                                              |SELECT pi.purchasable_item_id, u.account_id, pi.created_at
                                                                              |FROM purchased_items AS pi
                                                                              |INNER JOIN users u ON pi.user_id = u.id
                                                                              |WHERE purchasable_item_type = ? AND u.account_id IS NOT NULL
                                                                            """.stripMargin
    

    Then you call that like:

      val magazinePurchases = cachedPurchasesQuery("Magazine").list
    

    Or `foreach`, `firstOption`, etc. Most of the normal collection-y stuff.

    I prefer this over the Table mapping DSL and for-comprehension stuff personally. But I've only used it in production on smaller projects that are limited to under a dozen queries/statements or so.

ddispaltro 11 years ago

I really like scalikejdbc [1] over slick especially using the parameterized queries. They handle it more elegantly than slick which even for Scala has a pretty meteoric learning curve. I also really like how I don't need specific dialect to get it working on newer restricted sql db's/olap systems like phoenix [2] or presto [3].

[1] http://scalikejdbc.org/ [2] http://phoenix.apache.org/ [3] http://prestodb.io/

Joeri 11 years ago

Writing sql by hand doesn't have to mean you abandon things like autocomplete and automatic highlighting of typo's. SQL can be inspected by a proper ide just like any other language.

  • estebank 11 years ago

    To be fair, SQL has a syntax that is hard to provide (for example) autocompletion for, as the table comes after the fields, and the field names can be ambiguous.

    • staz 11 years ago

      Postgresql command line (psql) still manage to do an excellent job at it (but I concede it's pretty hard to replicate)

    • Zergy 11 years ago

      It seems this would be bypassed by letting the IDE hit the DB to fetch table information.

drbawb 11 years ago

>Strongly typed lanaguages are even cooler here ...

No expereince with Slick in particular; but I've been using jOOq[1] which I believe is similar.

To be honest I'm not entirely sold that these DSLs are what I'd consider "strongly typed." I can get jOOq to pretty easily yield queries that won't work if I switch out database dialects. (Ignoring, for a moment, that jOOq will let you embed SQL fragments as strings.)

As an example: jOOq will happily let you write an update query targeting a table bound to an alias. This type checks just fine. In fact this query will even work in Postgres; but it yields a syntax error from the database if you target SQL Server 2008 instead.

Then there's the issue of the dialects letting you use features the RDBMS doesn't support. For example you can build a merge statement regardless of which dialect is selected. If you were to select the Postgres dialect you'll get an "unsupported exception" at runtime.

If a method is RDBMS specific (in practice) then why is it part of the generic API?

---

My biggest beef with all these SQL abstraction layers is that many of them claim to be "write once, run anywhere." In practice I've just never seen that to be true.

I wish they would incorporate some sort of "capabilities" system that could run at compile time. Just imagine: when you swap in the Postgres driver your IDE throws a little red squiggle under the `.merge()` call, "method not found."

[1]: http://www.jooq.org/

  • lukaseder 11 years ago

    > To be honest I'm not entirely sold that these DSLs are what I'd consider "strongly typed."

    Absolutely! They're "quite" typesafe, much more than string-based SQL. Much less than actual compiled stored procedures.

    > As an example: jOOq will happily let you write an update query targeting a table bound to an alias.

    Yes, that currently cannot be detected.

    > Then there's the issue of the dialects letting you use features the RDBMS doesn't support

    That will be addressed in the near future when we implement an API "preprocessor" that will effectively remove all parts of the API that are not supported by your given dialect. We'll also distinguish between native support and emulated support, if this strictness matters in your application.

    The relevant issue is here: https://github.com/jOOQ/jOOQ/issues/720

    > In practice I've just never seen that to be true.

    True, but if you need "write once, run anywhere", you're probably much better off with a headstart than if you started from scratch.

  • twic 11 years ago

    > To be honest I'm not entirely sold that these DSLs are what I'd consider "strongly typed." I can get jOOq to pretty easily yield queries that won't work if I switch out database dialects.

    I can get Hibernate to pretty easily yield queries that won't work if I use a mildly unusual combination of JPA features. weeps

  • lukaseder 11 years ago

    > I can get jOOq to pretty easily yield queries that won't work if I switch out database dialects.

    In fact, you can do this with SQL. The following is perfectly valid in MySQL but not in any of the other dialects:

        SELECT a, b, COUNT(*)
        FROM t
        GROUP BY a
    

    SQL is so heavy with implicit semantics that cannot be expressed in syntax alone...

dinkumthinkum 11 years ago

Let's just be honest, ORMs add a lot more weight than just "p.username = 'Carl'". You do not always have a type checker, besides, there are unit tests. It is not just masochistic to use pure SQL.

beagle3 11 years ago

web2py gets a lot of hate from some parts of the python community (with Ronacher & Moss-Kaplan being the most prominent examples), but its DAL (Database Abstraction Layer) feels like the right way to approach this: It's not an ORM, but it gives you a lot of what ORM gives you; It's not directly SQL, but the mapping between DAL code and underlying SQL is almost trivial (though it might depend on the specific DB). It just works well, and can be used independently of web2py.

It will even do database migrations for you if the schema evolution is reasonably simple.

NicoJuicy 11 years ago

I actually like this a lot... This would replace my whole Service, Data and Repository layer in Asp.Net MVC (DDD pattern).

I suppose i could execute this with EF ( http://goo.gl/yrpver Stored Procedure mapping) and this way, i can still map my table in code (using Code-First)

learc83 11 years ago

Is there something like YeSQL for Go?

  • elithrar 11 years ago

    Not that I've seen, but I've love to see it. You could probably build it with something like text/template and write:

        SELECT id, title, expiry_date FROM posts WHERE id = {{ id }} AND expiry_date > {{ date }} ORDER BY expiry_date
    

    This would give you some flexibility--the template tags would be replaced by ? or $1 or :name depending on the database driver set on parse--and leans on the stdlib. This would tie in nicely with https://github.com/jmoiron/sqlx and its struct tag/marshaling behaviour.

    Access could be handled by a map[string]string, with the key as the filename and the value as the SQL (as a first thought).

    • NDizzle 11 years ago

      Forgive my ignorance as I'm not very familiar with golang, but what you're describing above looks (to me) like a parameterized query. Nothing real fancy about that. I'd assume that Go has support for parameterized queries already.

  • cmelbye 11 years ago
      getUser, _ := db.Prepare("select * from users where username=$1 limit 1")
      row := getUser.QueryRow("john")
walshemj 11 years ago

If you have been writing raw sql for years I hope you did not use use select star