otel-auto-instrumentation-python - __sched_cpufree: symbol not found #3573

jpicara opened this issue Jun 10, 2025 · 5 comments
Labels
bug Something isn't working

Comments

jpicara commented Jun 10, 2025

Describe your environment

No response

What happened?

Hello,
I am trying to set up auto-instrumentation for my Python apps. The same setup already works for my Java components.
Instrumentation CRD (based on https://opentelemetry.io/docs/platforms/kubernetes/operator/automatic/):


apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: demo-instrumentation
spec:
  exporter:
    endpoint: http://monitoring-platform-alloy.platform.svc.cluster.local:4317
  propagators:
    - tracecontext
    - baggage
  sampler:
    type: parentbased_traceidratio
    argument: "1" 

Also added the needed annotations under spec.template.metadata.annotations in my app:

spec:
  template:
    metadata:
      annotations:
        arch-support/amd64: 'true'
        arch-support/arm64: 'true'
        instrumentation.opentelemetry.io/inject-python: platform/demo-instrumentation

The init container, named opentelemetry-auto-instrumentation-python, is created properly:
Successfully pulled image "ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.53b1
But I get the following output from the application I want to instrument:


Importing of system_metrics failed, skipping it
Traceback (most recent call last):
  File "/otel-auto-instrumentation-python/opentelemetry/instrumentation/auto_instrumentation/_load.py", line 74, in _load_instrumentors
    distro.load_instrumentor(
  File "/otel-auto-instrumentation-python/opentelemetry/instrumentation/distro.py", line 61, in load_instrumentor
    instrumentor: BaseInstrumentor = entry_point.load()
  File "/otel-auto-instrumentation-python/importlib_metadata/__init__.py", line 189, in load
    module = import_module(match.group('module'))
  File "/usr/local/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/otel-auto-instrumentation-python/opentelemetry/instrumentation/system_metrics/__init__.py", line 100, in <module>
    import psutil
  File "/otel-auto-instrumentation-python/psutil/__init__.py", line 95, in <module>
    from . import _pslinux as _psplatform
  File "/otel-auto-instrumentation-python/psutil/_pslinux.py", line 26, in <module>
    from . import _psutil_linux as cext
ImportError: Error relocating /otel-auto-instrumentation-python/psutil/_psutil_linux.abi3.so: __sched_cpufree: symbol not found

Steps to Reproduce

Created and applied the CRD, added the required annotation, and got the unexpected log output.

Expected Result

Traces are being sent properly to my exporter backend.

Actual Result

No traces are visible, and the error message above is logged.

Additional context

No response

Would you like to implement a fix?

None

jpicara added the bug label Jun 10, 2025
Contributor

xrmx commented Jun 10, 2025

@jpicara That stack trace only reports that psutil could not be loaded; auto-instrumentation should still work just fine. Which distribution is your container image based on? If it is a musl-based one, you should annotate the template so that the matching build is loaded.
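
One way to check for musl from inside the application container is sketched below. It is a best-effort heuristic, not something the operator does: the helper name `detect_libc` and the searched directories are assumptions for illustration, and it relies on musl installing its dynamic loader as `ld-musl-<arch>.so.1`.

```python
import glob
import platform


def detect_libc(lib_dirs=("/lib", "/usr/lib")):
    """Best-effort guess of the C library: 'musl', 'glibc', or 'unknown'.

    Hypothetical helper for illustration; the directories are assumptions.
    """
    # musl-based images (for example Alpine) ship a loader named
    # ld-musl-<arch>.so.1, so its presence is a strong hint.
    for directory in lib_dirs:
        if glob.glob(f"{directory}/ld-musl-*.so.1"):
            return "musl"
    # platform.libc_ver() reports ('glibc', '<version>') on glibc systems
    # and typically an empty name elsewhere.
    name, _version = platform.libc_ver()
    if name == "glibc":
        return "glibc"
    return "unknown"


if __name__ == "__main__":
    print(detect_libc())
```

If this prints `musl`, the `instrumentation.opentelemetry.io/otel-python-platform: musl` annotation is the one to try. A glibc image that also has a musl package installed would be misreported, so treat the result as a hint only.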

Author

jpicara commented Jun 10, 2025

Yes, after adding that annotation I no longer get the stack trace.

spec:
  replicas: 2
  selector:
    matchLabels:
      app: orders-manager
  template:
    metadata:
      annotations:
        arch-support/amd64: 'true'
        arch-support/arm64: 'true'
        instrumentation.opentelemetry.io/inject-python: platform/demo-instrumentation
        instrumentation.opentelemetry.io/otel-python-platform: musl

However, I still cannot see any traces in my backend storage. This is how the CRD is configured:

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: demo-instrumentation
spec:
  exporter:
    endpoint: http://monitoring-platform-alloy.platform.svc.cluster.local:4317
  propagators:
    - tracecontext
    - baggage
  sampler:
    type: parentbased_traceidratio
    argument: "1" 

There are no error logs in the application or in the OpenTelemetry operator. I forgot to mention that I am using this version of the opentelemetry-operator:

      image: >-
        ghcr.io/open-telemetry/opentelemetry-operator/opentelemetry-operator:0.124.0
      imageID: >-
        ghcr.io/open-telemetry/opentelemetry-operator/opentelemetry-operator@sha256:dd647fe30d6e871e57a8a479784ed1582aa2f53d3f293dbcd6b5c7769ba73f5a
      containerID: >-
        containerd://9ba5bc66c244541718155fd9b9a68e81418af510e3aaff18c8c63c1ebab08e0c

Is there anything I am missing, or something I could add to find out where my traces "are lost", in case they are being ingested at all?

Contributor

xrmx commented Jun 10, 2025

Is there a chance you are sending HTTP traffic to a gRPC endpoint? 😅
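
A quick way to test this is to compare the exporter protocol and endpoint that the instrumented container actually sees. The sketch below reads the standard `OTEL_EXPORTER_OTLP_*` environment variables and flags a mismatch with the conventional OTLP ports (4317 for gRPC, 4318 for HTTP). The helper name `check_otlp_settings` is hypothetical, and a collector can listen on other ports, so a warning here is a hint rather than proof. My understanding is that the operator configures Python auto-instrumentation to export over `http/protobuf` by default; verify that against the pod's real environment.

```python
import os
from urllib.parse import urlparse

# Conventional OTLP ports; a collector may be configured differently.
CONVENTIONAL_PORTS = {"grpc": 4317, "http/protobuf": 4318}


def check_otlp_settings(env):
    """Return a list of warnings about endpoint/protocol mismatches.

    Hypothetical helper for illustration; `env` is any mapping of
    environment variable names to values.
    """
    # Signal-specific variables take precedence over the generic ones.
    endpoint = (env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT")
                or env.get("OTEL_EXPORTER_OTLP_ENDPOINT"))
    protocol = (env.get("OTEL_EXPORTER_OTLP_TRACES_PROTOCOL")
                or env.get("OTEL_EXPORTER_OTLP_PROTOCOL"))
    if not endpoint or not protocol:
        return ["endpoint or protocol is not set in the environment"]
    port = urlparse(endpoint).port
    expected = CONVENTIONAL_PORTS.get(protocol)
    if expected is not None and port is not None and port != expected:
        return [f"protocol {protocol!r} conventionally uses port {expected}, "
                f"but the endpoint uses port {port}"]
    return []


if __name__ == "__main__":
    # Run inside the instrumented container to inspect the injected values.
    for warning in check_otlp_settings(os.environ):
        print("warning:", warning)
```

If the protocol turns out to be `http/protobuf` while the endpoint points at port 4317, pointing the Instrumentation endpoint at the collector's HTTP receiver would be the first thing to try.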

Author

jpicara commented Jun 10, 2025

No, the collector is the same one the Java components use.

Author

jpicara commented Jun 11, 2025

I can see these log lines related to OpenTelemetry in the application:


{"@timestamp": "2025-06-11T05:58:24.391Z", "log_type": "LOG", "@version": "1", "service_name": "orders-manager", "log_level": "ERROR", "logger_name": "scheduler.scheduler", "message": "Order instance deletion in bpmn failed with error Task <Task pending name='Task-205895' coro=<AsyncioInstrumentor.trace_coroutine() running at /otel-auto-instrumentation-python/opentelemetry/instrumentation/asyncio/__init__.py:288> cb=[shield.<locals>._inner_done_callback() at /usr/local/lib/python3.10/asyncio/tasks.py:847]> got Future <Future pending cb=[_chain_future.<locals>._call_check_cancel() at /usr/local/lib/python3.10/asyncio/futures.py:385]> attached to a different loop. Next try at [1749635904](Epoch Timestamp)", "artifact_id": "qflow-oom:3.8.2.20250311100159_develop_0a94653e", "level_value": 40000, "thread_name": "ThreadPoolExecutor-1_0", "trace_token": "b5f9cf7fcfded022d368a0e413aae346", "user_identity": null, "user_name": "anonymous"}
{"@timestamp": "2025-06-11T05:58:24.391Z", "log_type": "LOG", "@version": "1", "service_name": "orders-manager", "log_level": "ERROR", "logger_name": "scheduler.scheduler", "message": "Order instance deletion in bpmn failed with error Task <Task pending name='Task-205895' coro=<AsyncioInstrumentor.trace_coroutine() running at /otel-auto-instrumentation-python/opentelemetry/instrumentation/asyncio/__init__.py:288> cb=[shield.<locals>._inner_done_callback() at /usr/local/lib/python3.10/asyncio/tasks.py:847]> got Future <Future pending cb=[_chain_future.<locals>._call_check_cancel() at /usr/local/lib/python3.10/asyncio/futures.py:385]> attached to a different loop. Next try at [1749635904](Epoch Timestamp)"}
{"@timestamp": "2025-06-11T05:58:24.527Z", "log_type": "LOG", "@version": "1", "service_name": "orders-manager", "log_level": "ERROR", "logger_name": "scheduler.scheduler", "message": "delete bpmn instances for order 0LL-943-4JG0 failed with error Task <Task pending name='Task-205904' coro=<AsyncioInstrumentor.trace_coroutine() running at /otel-auto-instrumentation-python/opentelemetry/instrumentation/asyncio/__init__.py:288> cb=[shield.<locals>._inner_done_callback() at /usr/local/lib/python3.10/asyncio/tasks.py:847]> got Future <Future pending cb=[_chain_future.<locals>._call_check_cancel() at /usr/local/lib/python3.10/asyncio/futures.py:385]> attached to a different loop", "artifact_id": "qflow-oom:3.8.2.20250311100159_develop_0a94653e", "level_value": 40000, "thread_name": "ThreadPoolExecutor-1_1", "trace_token": "17c70ea4acda460dff00e0e0c233cfae", "user_identity": null, "user_name": "anonymous"}

Do you have any idea why it is not working at all?
Thanks in advance!
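
For reference, the "attached to a different loop" text in those log lines is a generic asyncio error: it is raised when a task running on one event loop awaits a future that belongs to another loop. The sketch below reproduces the same message without OpenTelemetry. It is only an illustration of the error class, and the function name is hypothetical; it does not show what the application or the asyncio instrumentation is actually doing.

```python
import asyncio


def demo_cross_loop_await():
    """Await a future bound to loop A from a task on loop B.

    Returns the text of the resulting RuntimeError (illustration only).
    """
    loop_a = asyncio.new_event_loop()
    loop_b = asyncio.new_event_loop()
    future_on_a = loop_a.create_future()  # bound to loop A, never completed

    async def waiter():
        # The task created by run_until_complete() runs on loop B.
        await future_on_a

    try:
        loop_b.run_until_complete(waiter())
    except RuntimeError as exc:
        return str(exc)
    finally:
        loop_a.close()
        loop_b.close()
    return None


if __name__ == "__main__":
    print(demo_cross_loop_await())
```

The `thread_name` values in the log lines (`ThreadPoolExecutor-1_0`, `ThreadPoolExecutor-1_1`) suggest the failing code runs in worker threads, which is a common place for a second event loop to appear, but that is a guess from the logs alone.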
