Nozbe/WatermelonDB

RFC: New delete API

Open

#1016 opened on May 6, 2021

View on GitHub
 (12 comments) (5 reactions) (1 assignee)JavaScript (9,633 stars) (570 forks)batch import
help wanted

Description

RFC: New delete API

Goals:

  • Unified, simple, nice, understandable API for deleting records
  • Deprecate (and eventually remove) existing markAsDeleted, destroyPermanently, experimentalMarkAsDeleted, experimentalDestroyPermanently methods (which can be confusing)
  • Finally, high performance deletion of large trees of records (currently not possible)

Basic API

await model.remove()

This, by default, marks as deleted the model and all of its descendants.

remove() can only be called inside a writer action.

remove is used to avoid using delete, because it's a JS keyword. (I'm open to feedback, we can pick another word).

UPDATE: Unless I'm missing something, delete is only a keyword contextually so we actually can name the method delete()

Marking as deleted / destroying permanently:

Overwrite the default by explicitly stating the desired behavior:

await model.remove({ permanently: true }) // destroys permanently (actually removes from database)
await model.remove({ permanently: false }) // marks as deleted (so that deletion can be synced later)

Non-syncable apps

If the app does not use sync (is local-only), the Database instance can be configured to remove permanently by default:

const db = new Database({
  ...,
  // Not sure how to call this parameter, two ideas:
  removePermanentlyByDefault: true, // explicit name
  syncable: false, // more generic, could also control other behavior differences (can be a pro or con, not sure)
})

In that case, remove() destroys permanently by default.

I t would probably be possible to override it with { permanently: false }, but I'm not sure there are use cases for that -- maybe it's better to treat { syncable: false } as an optimization that skips keeping track of sync state altogether - WDYT?

I'm not entirely sure if this is the right design. I'd rather Model methods not have knowledge about Database configuration, and behave differently based on it. It feels wrong to me. And any library code to work with WatermelonDB would have to explicitly declare whether a remove is meant to be permanent or not.

On the other hand, unless you're syncing, using raw adapter methods, or unsafe raw SQL queries, the difference in behavior should be invisible - in both cases a removed record becomes invisible and unusable.

Also, for non-syncing apps, the need to specify { permanently: true } on every call would be very inconvenient and a regression (instead of an improvement) from just calling destroyPermanently() every time.

WDYT?

Descendants

By default, remove() removes all the record's descendants (children and their children, etc).

You can override this behavior and (possibly unsafely, depending on the app's semantics) force it to only remove the individual record:

// NOTE: I'm open to suggestions about this API, `descendants` is a confusing word for non-fluent English speakers
await model.remove({ descendants: false })

By default, descendants are determined using Model.associations. All tables that a model has_many of is considered to be descendant. For example:

export default class Comment extends Model {
  static associations = associations(
    ['tasks', { type: 'belongs_to', key: 'task_id' }], // parent
    ['attachments', { type: 'has_many', foreignKey: 'parent_id' }], // child (descendant)
  )
}

Removing a comment will also delete all attachments where attachment.parent_id == comment.id.

This behavior can be overriden by implementing static childrenToRemove:

export default class Comment extends Model {
  static childrenToRemove(ids) {
    // ids is an array of RecordIds of comments that are about to be deleted
    // return a list of queries that find all descendants that need to be deleted, like so:
    return [
      {
        table: 'attachments',
        query: Q.and(
          Q.where('parent_id', Q.oneOf(ids)),
          Q.where('parent_type', 'comment')
        ),
      },
    ]
  }
}

Note that this method isn't called on every comment to delete, but instead, it's called once on the Comment table to determine all children of all comments that are about to be deleted. If the returned record queries also have descendants, Watermelon will recursively call all their childrenToRemove(). This way, fast tree deletion can be achieved with multiple descendant levels, and it could also be extended to fast deleting a result of a query, not an individual record.

When childrenToRemove() is specified, associations are not consulted, so to override, you have to reimplement getting all children.

I'm also considering changing the definition so that instead of a simple static definition of children, an asynchronous function that returns actual records (not just a query) is used. This would be more powerful and might be necessary for some more complex app schemas, but it might also prevent some advanced optimizations. WDYT?

TODO: What about models that form cyclic graphs (a record can both have many and belong to records of the same type)? It would be easy to create an infinite loop without some special precautions. Do we need to worry about it? It's a relatively uncommon use case and Watermelon doesn't have great built-in support for this anyway, so maybe apps that need it can just worry about it on their own?

Query deletion

This removes all records matching the query:

await query.remove()

Again, { permanently: true/false } and { descendants: true/false } can be used to override the deletion behavior, and existing query.markAllAsDeleted() and query.destroyAllPermanently() are deprecated and will be removed.

"On delete" action

Sometimes, deleting a record triggers some additional action in addition to deleting its descendants.

In Nozbe Teams, there's one such case: Deleting a ProjectSection (a thing that organizes a Project into smaller chunks) should not delete the Tasks that belong to it - instead, they should simply be changed not to belong to any section.

We could create an API that goes something like this:

export default class ProjectSection extends Model {
  static onRemove(database, ids) {
    return database.write(async writer => {
      const tasks = database.get('tasks').query(Q.where('project_section_id', Q.oneOf(ids)))
      await writer.batch(
        tasks.map(task => task.prepareUpdate(() => {
          task.projectSection.set(null)
        }))
      )
    })
  }
}

However, there's a complication: if we want to remove the entire Project, we want to remove all its descendants, including all its tasks, and sections. We don't want sections' tasks to be moved to "without section", because they also get deleted. So how do we prevent this onRemove action to be called without breaking other use cases? When do we wait for onRemove to be executed? How do we ensure that onRemove doesn't change or remove records that we've just deleted/are about to delete? How would that work with Batching?

This makes me doubt that we want to have an "on delete" API anyway. I think in this case, the simplest thing to do is to create a custom removeSection method with this special behavior we can call on an individual section, and prevent users from accidentally calling section.remove(). This would preserve desired behavior for deleting the entire Project.

WDYT? I'm curious to hear other apps' use cases, maybe we do need some generic mechanism for this after all.

Batching

Currently, we can call record.prepareMarkAsDeleted() / record.prepareDestroyPermanently() to be able to make that change transactionally in a single batch with other changes. A requirement of prepareX methods is that they must be called synchronously with batch()… while remove() must be asynchronous.

This complicates the API, since we must split the remove operation into an asynchronous preparation of the list of records to remove, and a synchronous marking of those records.

Here's a few ideas:

// 1:
const preparedRemove = await record.prepareRemove({ ... })
await database.batch(...preparedRemove.prepare())

// 2:
const prepareRemove = await record.remove({ prepare: true })
await database.batch(...prepareRemove())

// 3 - preparedRemove here is a magic object that database.batch() understands and _it_ changes these Models
const preparedRemove = await record.prepareRemove({ ... })
await database.batch(preparedRemove)

WDYT?

Do we even need this? Does anyone care about running this remove in a single batch with other changes?

Deprecations & removals

The following APIs would be deprecated and removed in a later version:

Model.prepareMarkAsDeleted
Model.prepareDestroyPermanently
Model.markAsDeleted
Model.destroyPermanently
Model.experimentalMarkAsDeleted
Model.experimentalDestroyPermanently
Query.markAllAsDeleted
Query.destroyAllPermanently

Contributor guide