meilisearch/MeiliSearch

[investigation] LMDB RAM consumption is not restricted

Open

#4764 opened on Jul 3, 2024

(5 comments) (2 reactions) (0 assignees) Rust (20,887 stars) (733 forks)
Labels: batch import, CPU/RAM usage, good first issue, spike

Description

Related to https://github.com/meilisearch/meilisearch/issues/3397

Summary

When indexing a big dataset or changing major settings on it, with the indexing memory of the Meilisearch instance restricted by the flag --max-indexing-memory 2GB, the RAM consumption is higher than the threshold most of the time:

8GB wiki:
    full indexing: spike at 5.37GB
    change filter setting: spike at 3.02GB
    change the searchable setting: spike at 2.47GB, but 7.85GB when writing words_prefix_integer_docids
1GB songs:
    full indexing: spike at 5.00GB
    change filter setting: spike at 1.91GB
    change the searchable setting: spike at 3.44GB, but 4.32GB when writing words_prefix_integer_docids
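
To put these spikes in perspective, here is a minimal sketch that divides each reported peak by the 2GB limit. The numbers are copied from the list above. The function name `overshoot_factor` and the table layout are only illustrative and not part of Meilisearch.

```rust
/// Limit passed via --max-indexing-memory, in GB (from this report).
const LIMIT_GB: f64 = 2.0;

/// How many times the observed peak exceeds the configured limit.
/// A value below 1.0 means the peak stayed under the limit.
fn overshoot_factor(peak_gb: f64, limit_gb: f64) -> f64 {
    peak_gb / limit_gb
}

fn main() {
    // (dataset, operation, observed peak in GB), copied from the figures above.
    let peaks = [
        ("8GB wiki", "full indexing", 5.37),
        ("8GB wiki", "change filter setting", 3.02),
        ("8GB wiki", "change searchable setting", 2.47),
        ("8GB wiki", "writing words_prefix_integer_docids", 7.85),
        ("1GB songs", "full indexing", 5.00),
        ("1GB songs", "change filter setting", 1.91),
        ("1GB songs", "change searchable setting", 3.44),
        ("1GB songs", "writing words_prefix_integer_docids", 4.32),
    ];

    for (dataset, operation, peak) in peaks {
        let factor = overshoot_factor(peak, LIMIT_GB);
        println!("{dataset:<10} {operation:<36} {peak:.2} GB  ({factor:.2}x the limit)");
    }
}
```

Only the filter change on the songs dataset stays under the limit. Every other operation goes over it.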

In the previous issue, this overuse was due to the extractors. That no longer seems to be the case: the overuse now comes from LMDB writes, since LMDB has no restriction on its memory usage for now. However, the fact that LMDB uses a lot of RAM on a big machine (more than 8GB of RAM) does not mean it would make a smaller machine crash.

Next steps:

  • retry using the MDB_WRITEMAP flag
  • test the same use case on a machine with 4GB of maximum RAM and see if the indexing fails
  • investigate a way to constrain the LMDB RAM consumption
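
For the test on a 4GB machine, one way to record the spike is to read the peak resident set size that the kernel tracks for the process. The sketch below assumes a Linux host, where `/proc/<pid>/status` exposes this value as the `VmHWM` field. The helper names `parse_vm_hwm_kib` and `peak_rss_kib` are hypothetical, and this is not necessarily how the figures in this issue were measured.

```rust
use std::fs;

/// Extracts the `VmHWM` value (peak resident set size, in kiB) from the
/// contents of a Linux `/proc/<pid>/status` file.
fn parse_vm_hwm_kib(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("VmHWM:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|value| value.parse().ok())
}

/// Reads the peak RSS of process `pid` (hypothetical helper, Linux only).
fn peak_rss_kib(pid: u32) -> Option<u64> {
    let status = fs::read_to_string(format!("/proc/{pid}/status")).ok()?;
    parse_vm_hwm_kib(&status)
}

fn main() {
    // Pass the Meilisearch PID as the first argument; defaults to this process.
    let pid = std::env::args()
        .nth(1)
        .and_then(|arg| arg.parse().ok())
        .unwrap_or_else(std::process::id);

    match peak_rss_kib(pid) {
        Some(kib) => {
            let gib = kib as f64 / (1024.0 * 1024.0);
            println!("peak RSS of pid {pid}: {gib:.2} GiB");
        }
        None => eprintln!("could not read VmHWM for pid {pid}"),
    }
}
```

Resident set size includes file-backed mapped pages, so memory-mapped LMDB pages count toward this number even though the kernel may be able to reclaim them under memory pressure. This is one reason a high peak on a big machine may not translate into a crash on a smaller one.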
