All cluster resources being claimed by actors? #58

Open
@Chuukwudi

In the notebook, calling

from ray.data import ActorPoolStrategy

# Embed chunks
embedding_model_name = "thenlper/gte-base"
embedded_chunks = chunks_ds.map_batches(
    EmbedChunks,
    fn_constructor_kwargs={"model_name": embedding_model_name},
    batch_size=100,
    num_gpus=1,  # each actor in the pool reserves one GPU
    compute=ActorPoolStrategy(size=2),  # pool of two actors
)

# Sample
sample = embedded_chunks.take(1)

results in:

======== Autoscaler status: 2023-09-19 10:15:05.945390 ========
Node status
---------------------------------------------------------------
Healthy:
 1 node_39e554d28e4f63b9d3360ffdf267014a901a29d1601c039967717f26
Pending:
 (no pending nodes)
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 1.0/32.0 CPU
 1.0/1.0 GPU
 0B/10.09GiB memory
 11.70MiB/5.05GiB object_store_memory

Demands:
 {'CPU': 1.0, 'GPU': 1.0}: 1+ pending tasks/actors
(autoscaler +2m17s) Warning: The following resource request cannot be scheduled right now: {'CPU': 1.0, 'GPU': 1.0}. This is likely due to all cluster resources being claimed by actors. Consider creating fewer actors or adding more nodes to this Ray cluster.
(autoscaler +2m52s) Warning: The following resource request cannot be scheduled right now: {'CPU': 1.0, 'GPU': 1.0}. This is likely due to all cluster resources being claimed by actors. Consider creating fewer actors or adding more nodes to this Ray cluster.

Is there a solution?
I have tried changing ActorPoolStrategy to size 1 and reducing batch_size, yet the result is the same.
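
For reference, the resource arithmetic implied by the status output can be sketched in plain Python. This is only an illustration, not Ray's scheduler: the helper name `max_schedulable_actors` is hypothetical, and the totals (32 CPU, 1 GPU) and the per-actor request ({'CPU': 1.0, 'GPU': 1.0}) are copied from the log above.

```python
import math


def max_schedulable_actors(cluster_total, per_actor):
    """Hypothetical helper: how many actors with the given per-actor
    resource request fit into the cluster totals at the same time."""
    limits = []
    for name, need in per_actor.items():
        if need <= 0:
            continue  # a zero request never limits scheduling
        available = cluster_total.get(name, 0.0)
        limits.append(math.floor(available / need))
    return min(limits) if limits else 0


# Totals taken from the "Usage" section of the autoscaler status.
cluster_total = {"CPU": 32.0, "GPU": 1.0}
# Request taken from the "Demands" section.
per_actor = {"CPU": 1.0, "GPU": 1.0}
# Pool size passed to ActorPoolStrategy in the snippet above.
requested_pool_size = 2

fit = max_schedulable_actors(cluster_total, per_actor)
print(f"actors that fit: {fit}, requested: {requested_pool_size}")
```

With these numbers only one GPU actor fits, so a second actor from a pool of size 2 would stay pending, which matches the demand shown in the log. This sketch does not account for the size 1 attempt; whether something else was still holding the GPU in that run is not visible in the output above.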
