Proposed change: Replace `NonAggregate` with something less broken that can support `GROUP BY`


#1

A few notes: This has actually been fully implemented, see https://github.com/diesel-rs/diesel/tree/sg-group-by for the actual code. Replace () with NoGroupByClause to see how it relates to GROUP BY support.

This proposal uses ValidGrouping as the name of the replacement trait, but that name isn’t set in stone and I don’t like it very much. Feel free to bike shed it.

Description of the change

  • NonAggregate has been removed in favor of ValidGrouping. For most types you can replace impl NonAggregate with #[derive(ValidGrouping)]. For expressions which are aggregate functions, or otherwise care about aggregation/group by clauses, see the docs for details on how to implement this trait.

Why is this needed?

NonAggregate is broken. I originally thought this was just about GROUP BY support, but it’s not. NonAggregate is literally broken and we accept all sorts of queries that we shouldn’t.

First, let’s clarify the role of this trait. Since Diesel 1.0 does not support GROUP BY clauses (and we knew supporting them would require a 2.0 release), the only time an aggregate function can appear in the select clause is if the only time a column appears is inside of an aggregate function. This means that the query has an implicit GROUP BY 1. The whole table will be grouped, and the query returns 1 row. Aggregate and non-aggregate expressions cannot be mixed. (Note: It doesn’t matter if your backend supports this or not. At best your query is non-deterministic, and Diesel still wants to prevent it. In cases where backends differ in permissiveness, we use the most restrictive semantics, which usually means PG).

The way we implemented this is by making tuples only implement Expression if all values are NonAggregate. This has the side effect of making multiple aggregate expressions impossible, but that’s a niche use case that was deemed acceptable to ignore (sql can always be used to bypass this).

Except this is broken. Tuples aren’t the only way to make multiple expressions appear in a select clause. As a trivial example, .select((id, max(id))) will fail to compile, but select(id + max(id)) will compile just fine. A lot of the reasons for this limited previous implementation was due to the expectation that actually implementing this properly would require some form of mutual exclusion (likely disjointness based on associated types), but that simply turned out to not be true.

To fix all these shortcomings, a replacement for non-aggregate should meet all of the following:

  • It must be a direct requirement of arguments to SelectDsl, not just a bound on impl Expression for Tuple.
  • The trait should indicate whether an expression is valid with regards to aggregation, but not specifically imply aggregate/non-aggregate
  • Whether an expression is aggregate or not is communicated via an associated type
  • Any compound expression (tuples, functions with multiple arguments, operators, etc) are only valid if all sub-expressions have the same value for this type
  • It should be generic over the group by clause (this proposal does not require the implementation of GROUP BY support, but the type NoGroupByClause should be added so we can easily and backwards compatibly add proper GROUP BY support in the future)

Why can’t this have a deprecation cycle?

We can have NonAggregate imply ValidGrouping, or vice versa, but not both. More concretely, we can have either of these two impls (but not both):

impl<T> NonAggregate for T
where
    T: ValidGrouping<NoGroupByClause, IsAggregate = NotAggregate>,
{
}

impl<T> ValidGrouping<NoGroupByClause> for T
where
    T: NonAggregate,
{
    type IsAggregate = NotAggregate;
}

With either impl, we break the contract that impl NonAggregate for MyType means (MyType, OtherType): NonAggregate.

Expected user impact

Unknown, but likely small to non-existent. Code doing these things would be affected:

impl NonAggregate for MyType

I’m not sure if that actually exists in the wild. In theory it should, since none of our derives implement this trait (sql_function! and diesel_*_operator! do). Most things that have a manual impl Expression should need to do this, but it’s also definitely possible to forget, since it only affects using the expression in tuples, where clauses, and order clauses…

It definitely should theoretically exist somewhere though, but I can’t find it.

The other kind of code which would be broken is code with a where clause that includes T: NonAggregate. I doubt such code exists outside of Diesel, as I’ve been unable to think of any reason to do so. I guess someone might be writing T: Expression + NonAggregate, U: Expression + NonAggregate instead of (T, U): Expression?

Migration strategy

In a minor version prior to 2.0, we introduce derives for SelectableExpression and NonAggregate (#[derive(SelectableExpression)] implies #[derive(NonAggregate)]. We don’t add an explicit derive for AppearsOnTable, since you should never need a manual impl for that, but a derived one for SelectableExpression).

This should mitigate any unlikely breakage that occurs. Code in the wild that is broken by this change will either need to migrate to one of those derives if they haven’t already, or make the following changes:

For the vast majority of expressions, the following changes will be made:

impl<T> NonAggregate for MyType<T>
where
    T: NonAggregate

will change to

impl<T, GroupBy> ValidGrouping<GroupBy> for MyType<T>
where
    T: ValidGrouping<GroupBy>,
{
    type IsAggregate = T::IsAggregate;
}

Code which looks like:

impl<T, U> NonAggregate for MyType<T, U>
where
    T: NonAggregate,
    U: NonAggregate,
{
}

will change to:

impl<T, U, IsAggregate, GroupBy> ValidGrouping<GroupBy> for MyType<T, U>
where
    T: ValidGrouping<GroupBy, IsAggregate = IsAggregate>,
    U: ValidGrouping<GroupBy, IsAggregate = IsAggregate>,
{
    type IsAggregate = IsAggregate;
}

Any code with a where clause of T: NonAggregate will change to T: ValidGrouping<NoGroupByClause, IsAggregate = NotAggregate>

Any types which manually implement Expression, but do not implement NonAggregate intentionally should either switch to using sql_function!, or will need to add:

impl<T> ValidGrouping<GroupBy> for MyType<T>
where
    T: ValidGrouping<GroupBy, IsAggregate = NotAggregate>,
{
    type IsAggregate = Aggregate;
}

Note that unless such type already had a T: NonAggregate bound on its Expression impl, this changes the semantics of that type. This likely affects all aggregate expressions, but is a bug fix.

One other point of note is that I would like to keep NonAggregate around for use outside of select statements, and to provide a nice semantic separation between “I care about whether this is a valid aggregate statement in the context of a select statement given a group by clause” vs “I evaluate this expression before grouping”. In theory we could just keep it with the blanket impl given above. However, given that we cannot prevent manual implementations of a trait, that will likely do more harm than good.

For that reason, I propose we remove NonAggregate in 2.0, and re-add it in 2.1. This will cause churn, but should force any manual impls to switch to the new trait. Alternatively, if trait aliases are stable at the time 2.0 is released, we can instead do trait NonAggregate = ValidGrouping<NoGroupByClause, IsAggregate = No>.