brandur.org

Sqlc: 2024 check in

It’s been almost three years since I wrote about the use of sqlc in our Go stack. The other night someone asked me if I still thought that this was the right direction, so I thought I’d do a quick check in about it.

We use sqlc for every database call in our app, and we’re now up to a little under 700 queries across 101 files (we use one model per file, so that’s ~100 domain models), with ~7,800 lines of SQL and 34k lines of generated Go.

$ ag --stats '^\-\- name: ' dbsqlc/
...
78:-- name: DiscountItemGetByInvoiceID :many
91:-- name: DiscountItemDeleteByInvoiceID :many

dbsqlc/email_verification.sql
17:-- name: EmailVerificationGetByID :one
22:-- name: EmailVerificationGetByTokenHash :one
30:-- name: EmailVerificationGetUnusedByAccountIDAndEmail :many
39:-- name: EmailVerificationGetUnusedByTeamIDAndEmail :many
48:-- name: EmailVerificationInsert :one
78:-- name: EmailVerificationUpdate :one
86:-- name: EmailVerificationDeleteExpiredWithLimit :one

682 matches
101 files contained matches
287 files searched
1383455 bytes searched
0.015354 seconds
$ cloc dbsqlc/
     287 text files.
     287 unique files.
       0 files ignored.

github.com/AlDanial/cloc v 2.00  T=0.21 s (1343.5 files/s, 227367.4 lines/s)
------------------------------------------------------------
Language           files       blank      comment       code
------------------------------------------------------------
Go                   184        2943         1232      33928
SQL                  101         867         1408       7785
YAML                   2           0            6        401
------------------------------------------------------------
SUM:                 287        3810         2646      42114
------------------------------------------------------------

Generating code is still fast, taking < 1s across two packages on a commodity MacBook Air (on battery):

$ time make generate/sqlc
cd dbsqlc && sqlc generate
cd dbsqlc/dbsqlcephemeral && sqlc generate

real    0.84s
user    0.74s
sys     0.09s

Sqlc’s achilles heel continues to be a lack of any kind of mechanism for putting dynamic code into queries, making it difficult to do things like apply an arbitrary filter or update an arbitrary field.

An example of where this shows up are queries for list endpoints, which might have one or more filters applied in arbitrary combinations. To work around dynamic code not being possible, we have to make creative use of booleans and CASE statements instead. Feast your eyes on this monster:

-- name: SavedQueryGetPage :many
SELECT *
FROM saved_query
WHERE
    CASE
        WHEN @by_account_id::boolean THEN account_id = @account_id::uuid
        WHEN @by_cluster_id::boolean THEN cluster_id = @cluster_id::uuid
        WHEN @by_saved_query_folder_id::boolean THEN saved_query_folder_id = @saved_query_folder_id::uuid
        WHEN @by_saved_query_folder_id_null::boolean THEN saved_query_folder_id IS NULL
        WHEN @by_team_id::boolean THEN team_id = @team_id::uuid
        ELSE false -- protect against forgotten predicate
    END
    AND
        CASE WHEN @cursor_specified::boolean THEN
            CASE WHEN @by_id::boolean   AND NOT @descending::boolean THEN id::text    > @cursor_threshold::text
                 WHEN @by_id            AND     @descending          THEN id::text    < @cursor_threshold
                 WHEN @by_name::boolean AND NOT @descending          THEN lower(name) > lower(@cursor_threshold)
                 WHEN @by_name          AND     @descending          THEN lower(name) < lower(@cursor_threshold)
            END
        ELSE
            id = id
        END
ORDER BY
    CASE WHEN @by_id   AND NOT @descending THEN id          END ASC,
    CASE WHEN @by_id   AND     @descending THEN id          END DESC,
    CASE WHEN @by_name AND NOT @descending THEN lower(name) END ASC,
    CASE WHEN @by_name AND     @descending THEN lower(name) END DESC
LIMIT @max;

Another place are update queries, which have similar problems. Take a look at this behemoth:

-- name: ClusterUpdate :one
UPDATE cluster
SET
    archived_at = CASE WHEN @archived_at_do_update::boolean THEN @archived_at ELSE archived_at END,
    cluster_group_id = CASE WHEN @cluster_group_id_do_update::boolean THEN @cluster_group_id ELSE cluster_group_id END,
    created_at = CASE WHEN @created_at_do_update::boolean THEN @created_at ELSE created_at END,
    disk_available_mb = CASE WHEN @disk_available_mb_do_update::boolean THEN @disk_available_mb ELSE disk_available_mb END,
    disk_total_size_mb = CASE WHEN @disk_total_size_mb_do_update::boolean THEN @disk_total_size_mb ELSE disk_total_size_mb END,
    disk_used_mb = CASE WHEN @disk_used_mb_do_update::boolean THEN @disk_used_mb ELSE disk_used_mb END,
    environment = CASE WHEN @environment_do_update::boolean THEN @environment ELSE environment END,
    host = CASE WHEN @host_do_update::boolean THEN @host ELSE host END,
    is_ha = CASE WHEN @is_ha_do_update::boolean THEN @is_ha ELSE is_ha END,
    is_protected = CASE WHEN @is_protected_do_update::boolean THEN @is_protected ELSE is_protected END,
    is_suspended = CASE WHEN @is_suspended_do_update::boolean THEN @is_suspended ELSE is_suspended END,
    is_trial = CASE WHEN @is_trial_do_update::boolean THEN @is_trial ELSE is_trial END,
    keychain_id = CASE WHEN @keychain_id_do_update::boolean THEN @keychain_id ELSE keychain_id END,
    last_resize_at = CASE WHEN @last_resize_at_do_update::boolean THEN @last_resize_at ELSE last_resize_at END,
    last_resumed_at = CASE WHEN @last_resumed_at_do_update::boolean THEN @last_resumed_at ELSE last_resumed_at END,
    last_seen_from_owl_at = CASE WHEN @last_seen_from_owl_at_do_update::boolean THEN @last_seen_from_owl_at ELSE last_seen_from_owl_at END,
    major_version = CASE WHEN @major_version_do_update::boolean THEN @major_version ELSE major_version END,
    maintenance_window_start = CASE WHEN @maintenance_window_start_do_update::boolean THEN @maintenance_window_start  ELSE maintenance_window_start END,
    marketplace_resource_id = CASE WHEN @marketplace_resource_id_do_update::boolean THEN @marketplace_resource_id ELSE marketplace_resource_id END,
    name = CASE WHEN @name_do_update::boolean THEN @name ELSE name END,
    network_id = CASE WHEN @network_id_do_update::boolean THEN @network_id ELSE network_id END,
    parent_id = CASE WHEN @parent_id_do_update::boolean THEN @parent_id ELSE parent_id END,
    plan_id = CASE WHEN @plan_id_do_update::boolean THEN @plan_id ELSE plan_id END,
    postgres_version_id = CASE WHEN @postgres_version_id_do_update::boolean THEN @postgres_version_id ELSE postgres_version_id END,
    reset_stats_last_issued_at = CASE WHEN @reset_stats_last_issued_at_do_update::boolean THEN @reset_stats_last_issued_at ELSE reset_stats_last_issued_at END,
    reset_stats_next_eligible_at = CASE WHEN @reset_stats_next_eligible_at_do_update::boolean THEN @reset_stats_next_eligible_at ELSE reset_stats_next_eligible_at END,
    reset_stats_weekly = CASE WHEN @reset_stats_weekly_do_update::boolean THEN @reset_stats_weekly ELSE reset_stats_weekly END,
    state = CASE WHEN @state_do_update::boolean THEN @state ELSE state END,
    storage = CASE WHEN @storage_do_update::boolean THEN @storage ELSE storage END
WHERE id = @id
RETURNING *;

Not great, but most queries look quite a bit nicer than this, and even the ugly ones aren’t as bad to maintain as they look. After a while you’ve established common patterns that plug all the major holes, and you copy/paste them as needed. It’s manageable.

The use of raw SQL remains an advantage overall. We’re able to express very complex queries involving CTEs and subqueries that would look like absolute train wrecks if encoded in the DSL of an ORM, and the first pass of validation through Postgres’ parser means they’re much less error prone to write and iterate on. See some complex ones in River for example.

Overall, my answer to “is sqlc the right choice” is a resounding yes.

Even with almost eight thousand lines of SQL now, the tooling remains fast, and day-to-day quality of life is outstanding.

I can’t prove it, but I’m fairly sure that if we’d been using one of the other paths in Go like building queries by concatenating strings together, or using one of the untyped ORMs (both of which we tried prior to sqlc), we would’ve long ago worked ourselves into a place that was completely unsustainable, and might even be frozen out by now (or slowed to a very significant extent) as the difficulty of the process made any addition or change long and arduous.

Did I make a mistake? Please consider sending a pull request.