[feature request] support write byte[] to bytes_value instead of array_value in TagWriter #6535

@JunweiSUN

Package

OpenTelemetry.Exporter.OpenTelemetryProtocol

Is your feature request related to a problem?

When writing a large byte[] as a LogRecord attribute through ProtobufOtlpTagWriter, the current TagWriter implementation serializes the byte[] as array_value in the proto schema instead of bytes_value. This raises two major concerns:

  1. Serialization and deserialization are slow, because every element of the array must be visited and written into the array_value field individually.
  2. The serialized payload is larger, because each byte becomes a nested value and TagWriter widens each byte to a long.
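
To make the size concern concrete, here is a rough sketch that estimates the encoded size of one AnyValue under both representations. It assumes the field numbers from the OTLP common.proto (int_value = 3, array_value = 5, bytes_value = 7, ArrayValue.values = 1) and standard protobuf varint rules. The class and method names are hypothetical and are not part of the exporter.

```csharp
using System;

// Hypothetical helper for illustration only; not part of OpenTelemetry.
public static class AnyValueSizeSketch
{
    // Number of bytes a protobuf varint needs for a non-negative value.
    public static int VarintSize(ulong value)
    {
        int size = 1;
        while (value >= 0x80)
        {
            value >>= 7;
            size++;
        }

        return size;
    }

    // Size of AnyValue { bytes_value = payload }:
    // one tag byte + length varint + the raw bytes.
    public static int BytesValueSize(byte[] payload)
        => 1 + VarintSize((ulong)payload.Length) + payload.Length;

    // Size of AnyValue { array_value { values: [ { int_value: b }, ... ] } }.
    public static int ArrayValueSize(byte[] payload)
    {
        int body = 0;
        foreach (byte b in payload)
        {
            // Inner AnyValue: int_value tag + varint (2 bytes when b >= 128).
            int element = 1 + VarintSize(b);

            // ArrayValue.values entry: tag + length prefix + inner AnyValue.
            body += 1 + VarintSize((ulong)element) + element;
        }

        // Outer array_value field: tag + length prefix + body.
        return 1 + VarintSize((ulong)body) + body;
    }
}
```

Under these assumptions each byte costs 4 or 5 bytes in the array_value form, so a random 5000-byte payload comes to roughly 22500 bytes, against 5003 bytes as bytes_value. That is in line with the roughly 4.5x size difference reported below.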

A related issue, #5724, asks that LogRecord.Body support the different value types allowed by the OpenTelemetry specification rather than only string.

What is the expected behavior?

I have opened PR #6534 so that ProtobufOtlpTagWriter writes byte[] directly as bytes_value, which significantly improves performance when writing large byte[] objects.
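
As an illustration of why the direct path is cheap, below is a minimal sketch of writing a length-delimited bytes_value field into a preallocated buffer. It is not the code from the PR. The class, method names, and buffer handling are hypothetical, and it assumes field number 7 for bytes_value as in the OTLP common.proto.

```csharp
using System;

// Hypothetical writer for illustration only; not the PR's implementation.
public static class BytesValueWriterSketch
{
    // Field 7 (bytes_value), wire type 2 (length-delimited) => 0x3A.
    private const byte BytesValueTag = (7 << 3) | 2;

    // Writes a protobuf varint and returns the new write position.
    public static int WriteVarint(byte[] buffer, int position, uint value)
    {
        while (value >= 0x80)
        {
            buffer[position++] = (byte)((value & 0x7F) | 0x80);
            value >>= 7;
        }

        buffer[position++] = (byte)value;
        return position;
    }

    // Writes tag + length + payload; the payload is a single block copy
    // rather than one nested message per element.
    public static int WriteBytesValue(byte[] buffer, int position, ReadOnlySpan<byte> value)
    {
        buffer[position++] = BytesValueTag;
        position = WriteVarint(buffer, position, (uint)value.Length);
        value.CopyTo(buffer.AsSpan(position));
        return position + value.Length;
    }
}
```

The caller is assumed to have sized the buffer beforehand. A real writer would also need to grow the buffer and respect any attribute size limits.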

Here is a simple benchmark that creates 20000 random byte arrays of 5000 bytes each:

using Microsoft.Extensions.Logging;
using OpenTelemetry.Logs;
namespace OtlpLogClient;

class Program
{
    static void Main(string[] args)
    {
        using var loggerFactory = LoggerFactory.Create(builder =>
        {
            builder.AddOpenTelemetry(options =>
            {
                options.AddOtlpExporter(otlpOptions =>
                {
                    otlpOptions.Endpoint = new Uri("http://localhost:4317");
                    otlpOptions.Protocol = OpenTelemetry.Exporter.OtlpExportProtocol.Grpc;
                });
            });
        });
        var logger = loggerFactory.CreateLogger("SimpleClient");

        Random random = new Random();
        List<byte[]> payloads = new List<byte[]>();
        var numOfRecords = 20000;
        var payloadSize = 5000;
        for (int i = 0; i < numOfRecords; i++)
        {
            var bytes = new byte[payloadSize];
            random.NextBytes(bytes);
            payloads.Add(bytes);
        }

        foreach (var payload in payloads)
        {
            logger.LogInformation("{payload}", payload);
        }

        Console.ReadKey();
    }
}

We record the serialization time and the serialized buffer size by adding some logging to OtlpLogExporter.cs:

public override ExportResult Export(in Batch<LogRecord> logRecordBatch)
#pragma warning restore CA1725 // Parameter names should match base declaration
    {
        using var scope = SuppressInstrumentationScope.Begin();

        try
        {
            var before = DateTime.Now;
            int writePosition = ProtobufOtlpLogSerializer.WriteLogsData(ref this.buffer, this.startWritePosition, this.sdkLimitOptions, this.experimentalOptions, this.Resource, logRecordBatch);
            // log the serialization time
            Console.WriteLine($"Write LogsData uses: {(DateTime.Now - before).TotalMilliseconds}");

            if (this.startWritePosition == GrpcStartWritePosition)
            {
                Span<byte> data = new Span<byte>(this.buffer, 1, 4);
                var dataLength = writePosition - GrpcStartWritePosition;
                BinaryPrimitives.WriteUInt32BigEndian(data, (uint)dataLength);
                // log the serialized size
                Console.WriteLine($"Data Length: {dataLength}");
            }

            if (!this.transmissionHandler.TrySubmitRequest(this.buffer, writePosition))
            {
                return ExportResult.Failure;
            }
        }
        catch (Exception ex)
        {
            OpenTelemetryProtocolExporterEventSource.Log.ExportMethodException(ex);
            return ExportResult.Failure;
        }

        return ExportResult.Success;
    }
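
For sub-millisecond measurements such as the ones below, System.Diagnostics.Stopwatch is generally a better fit than subtracting DateTime.Now values. A small sketch of a timing helper that could wrap the WriteLogsData call is shown here. The helper name is hypothetical and it was not used to produce the numbers in this issue.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper for illustration only.
public static class TimingSketch
{
    // Runs the delegate once and returns its result with the elapsed time in ms.
    public static (T Result, double Milliseconds) Measure<T>(Func<T> action)
    {
        var stopwatch = Stopwatch.StartNew();
        T result = action();
        stopwatch.Stop();
        return (result, stopwatch.Elapsed.TotalMilliseconds);
    }
}
```

Since `ref` arguments cannot be captured in a lambda, inside Export it may be simpler to call Stopwatch.StartNew() before WriteLogsData and read Elapsed afterwards.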

Current:

Write LogsData uses: 479.521
Data Length: 11557160
Write LogsData uses: 233.2154
Data Length: 11556995
Write LogsData uses: 232.3389
Data Length: 11557581
Write LogsData uses: 285.0561
Data Length: 11559325
Write LogsData uses: 243.3509
Data Length: 11557485

With #6534:

Data Length: 2596552
Write LogsData uses: 3.0461
Data Length: 2596552
Write LogsData uses: 1.107
Data Length: 2596552
Write LogsData uses: 1.0744
Data Length: 2596552
Write LogsData uses: 1.1594
Data Length: 2596552

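In the runs shown above, serialization time per batch drops from roughly 230-480 ms to roughly 1-3 ms, and the serialized size shrinks by about 4.5x (from about 11557160 bytes to 2596552 bytes).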

Please correct me if there is anything wrong in the benchmark.

Which alternative solutions or features have you considered?

N/A

Additional context

No response
