
Problem: Archival storage fails when there are too many packages with deletion requests #1792

@replaceafill

Description

Expected behaviour

The Archival storage tab always loads successfully.

Current behaviour

When there are many packages with pending deletion requests in the Storage Service, the Archival storage tab fails with:

ERROR     2026-04-08 00:45:02  django.request:log:log_response:253:  Internal Server Error: /archival-storage/
Traceback (most recent call last):
  File "/pyenv/data/versions/3.10.20/lib/python3.10/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
  File "/pyenv/data/versions/3.10.20/lib/python3.10/site-packages/django/core/handlers/base.py", line 197, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/src/src/archivematica/dashboard/components/archival_storage/views.py", line 136, in execute
    total_size = total_size_of_aips(search_service)
  File "/src/src/archivematica/dashboard/components/archival_storage/views.py", line 766, in total_size_of_aips
    results = search_service.search_aips(query, size=0)
  File "/src/src/archivematica/search/service.py", line 872, in search_aips
    return self._search_index(self.aips_index, query, size, from_, sort, fields)
  File "/src/src/archivematica/search/service.py", line 851, in _search_index
    return dict(self.client.search(**search_params))
  File "/pyenv/data/versions/3.10.20/lib/python3.10/site-packages/elasticsearch/_sync/client/utils.py", line 458, in wrapped
    return api(*args, **kwargs)
  File "/pyenv/data/versions/3.10.20/lib/python3.10/site-packages/elasticsearch/_sync/client/__init__.py", line 5114, in search
    return self.perform_request(  # type: ignore[return-value]
  File "/pyenv/data/versions/3.10.20/lib/python3.10/site-packages/elasticsearch/_sync/client/_base.py", line 271, in perform_request
    response = self._perform_request(
  File "/pyenv/data/versions/3.10.20/lib/python3.10/site-packages/elasticsearch/_sync/client/_base.py", line 351, in _perform_request
    raise HTTP_EXCEPTIONS.get(meta.status, ApiError)(
elasticsearch.BadRequestError: BadRequestError(400, 'search_phase_execution_exception', 'failed to create query: maxClauseCount is set to 2520')

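The error is caused by the elasticsearch_query_excluding_aips_pending_deletion helper, which creates a separate term query for each package with a pending deletion request and combines them all in a single must_not clause. Once the number of such packages exceeds the Elasticsearch maxClauseCount limit (2520 in this environment), the search request is rejected with the 400 error above.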
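
To illustrate the query shape, here is a minimal sketch that does not reproduce the helper's actual source. The function names and the `uuid` field name are assumptions made for the example. It contrasts the per-package `term` clauses described above with a single `terms` clause, which is one possible way to keep the clause count constant. Whether that avoids the limit depends on the Elasticsearch version, and `terms` queries have their own cap (`index.max_terms_count`, 65536 by default), so treat it as an idea to test rather than a confirmed fix.

```python
# Hypothetical helpers; names and the "uuid" field are assumptions.


def per_package_must_not(uuids, field="uuid"):
    """Build one `term` clause per UUID (the shape described in this issue)."""
    return {"bool": {"must_not": [{"term": {field: u}} for u in uuids]}}


def single_terms_must_not(uuids, field="uuid"):
    """Build a single `terms` clause that carries every UUID."""
    return {"bool": {"must_not": [{"terms": {field: list(uuids)}}]}}


# Same scale as the reproduction: 9k packages with pending deletion requests.
uuids = [f"aip-{i}" for i in range(9000)]

per_package = per_package_must_not(uuids)
single = single_terms_must_not(uuids)

print(len(per_package["bool"]["must_not"]))  # 9000 clauses, above the 2520 limit
print(len(single["bool"]["must_not"]))       # 1 clause holding 9000 values
```

With 9k pending deletions, the first shape produces 9000 clauses against a limit of 2520, which matches the failure reported here.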

Steps to reproduce

I created 10,000 artificial AIP packages in the Storage Service database, 9,000 of them with pending deletion requests, and then opened the Archival storage tab.

Your environment (version of Archivematica, operating system, other relevant details)

artefactual/archivematica@1fc51d4
artefactual/archivematica-storage-service@1264f74


For Artefactual use:

Before you close this issue, you must check off the following:

  • All pull requests related to this issue are properly linked
  • All pull requests related to this issue have been merged
  • A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
  • Documentation regarding this issue has been written and merged (if applicable)
  • Details about this issue have been added to the release notes (if applicable)
