Java Application Manual Instrumentation for Distributed Traces

Manual instrumentation provides enhanced insight into the operations of distributed systems. By instrumenting your Java applications manually, you gain greater control over the data you collect, leading to improved visibility across your distributed architecture.
In this blog series, we are covering application instrumentation steps for distributed tracing with OpenTelemetry standards across multiple languages. Earlier, we covered Golang Application Instrumentation for Distributed Traces and DotNet Application Instrumentation for Distributed Traces. Here we are going to cover the instrumentation for Java.
OpenTelemetry is a set of libraries, APIs, agents, and tools designed to capture, process, and export telemetry data—specifically traces, logs, and metrics—from distributed systems. It’s vendor-neutral and open-source, which means your business has interoperability and freedom of choice to implement observability systems across a wide range of services and technologies.
You can break OpenTelemetry down into a few main concepts: signals, APIs, context and propagation, and resources and semantic conventions.
Signals in OpenTelemetry are traces, metrics, and logs. A trace records the end-to-end path of a request as it moves across services, including how long each step takes. Traces are composed of spans, which are named individual units of work with start and end timestamps and contextual attributes.
Metrics are quantitative measurements taken over time (CPU usage, memory usage, disk usage) that help you understand the overall performance of your application. Logs, on the other hand, are records of events that occur on a system and provide insight into errors and other activity.
OpenTelemetry defines a language-agnostic API that helps teams create code that implements the API to collect and process data and export it to their chosen backends. The API allows anyone to collect the same data, whether using custom software or an out-of-the-box monitoring solution, allowing them to process data on their own terms and tailor a monitoring solution based on their needs.
Context is a concept used to share data (like span context) between code and networks. Context propagation ensures that distributed traces stay connected as requests travel across networks through different services—helping teams get a holistic view across the entire infrastructure.
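To make propagation concrete: when a request crosses a service boundary over HTTP, the span context typically travels in a request header. One widely used format, which OpenTelemetry supports, is the W3C Trace Context `traceparent` header. The sketch below is a simplified, stand-alone model of that header and is not part of the OpenTelemetry API; the class name `TraceParent` and its methods are hypothetical, and the sample IDs are illustrative values.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that models the W3C traceparent header; not an OpenTelemetry class. */
final class TraceParent {
    // Layout: version "00", 32 hex chars of trace ID, 16 hex chars of span ID, 2 hex chars of flags
    private static final Pattern FORMAT =
            Pattern.compile("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");
    private static final String ZERO_TRACE_ID = "00000000000000000000000000000000";
    private static final String ZERO_SPAN_ID = "0000000000000000";

    final String traceId;
    final String spanId;
    final boolean sampled;

    TraceParent(String traceId, String spanId, boolean sampled) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.sampled = sampled;
    }

    /** Serializes the context the way a calling service would write it into the header. */
    String toHeader() {
        return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
    }

    /** Parses a header value on the receiving side; empty if the value is malformed. */
    static Optional<TraceParent> parse(String header) {
        if (header == null) {
            return Optional.empty();
        }
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        String traceId = m.group(1);
        String spanId = m.group(2);
        // All-zero IDs are defined as invalid
        if (ZERO_TRACE_ID.equals(traceId) || ZERO_SPAN_ID.equals(spanId)) {
            return Optional.empty();
        }
        // Only the lowest flag bit (sampled) is modelled here
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) == 0x01;
        return Optional.of(new TraceParent(traceId, spanId, sampled));
    }

    public static void main(String[] args) {
        TraceParent outgoing =
                new TraceParent("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", true);
        String header = outgoing.toHeader();
        System.out.println(header);
        TraceParent incoming = parse(header).orElseThrow(IllegalStateException::new);
        System.out.println(incoming.traceId.equals(outgoing.traceId));
    }
}
```

Because the receiving service reuses the incoming trace ID for the spans it creates, spans from both services end up in the same trace.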
A resource is what provides information about the entity producing data. It contains information like the host name, device environment, and host details. Semantic conventions are the standardized attributes and naming conventions that make telemetry data more consistent and allow any environment to uniformly interpret the data without worrying about variations in data output.
Understanding these concepts will help you decipher telemetry output and get started with your OpenTelemetry projects. So, let’s start by setting up a new project.
Custom instrumentation in Java applications allows developers to capture more granular telemetry data beyond what automatic instrumentation provides. By manually defining spans and adding attributes, teams can gain deeper insights into specific application behaviors and business logic within a distributed system.
Attributes are key-value pairs attached to spans, providing contextual metadata about an operation. These attributes can include details such as user IDs, transaction types, HTTP request details, or database queries. By adding relevant attributes, developers can enhance traceability, making it easier to filter and analyze performance data based on meaningful application-specific insights.
Multi-span attributes allow developers to maintain consistency across spans by propagating key metadata across multiple operations. This is especially useful when tracking a request across services, ensuring that relevant information, such as correlation IDs or session details, remains linked throughout the trace.
To begin, create a new Java project and add the following dependencies, which are required for OpenTelemetry manual instrumentation. The first listing is for Maven (pom.xml) and the second is for Gradle (build.gradle).
<project>
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-bom</artifactId>
        <version>1.2.0</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-api</artifactId>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-sdk</artifactId>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-exporter-otlp</artifactId>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-semconv</artifactId>
      <version>1.5.0-alpha</version>
    </dependency>
    <dependency>
      <groupId>io.grpc</groupId>
      <artifactId>grpc-netty-shaded</artifactId>
      <version>1.39.0</version>
    </dependency>
  </dependencies>
</project>
dependencies {
    implementation platform("io.opentelemetry:opentelemetry-bom:1.2.0")
    implementation('io.opentelemetry:opentelemetry-api')
    implementation('io.opentelemetry:opentelemetry-sdk')
    implementation('io.opentelemetry:opentelemetry-exporter-otlp')
    implementation('io.opentelemetry:opentelemetry-semconv:1.5.0-alpha')
    implementation('io.grpc:grpc-netty-shaded:1.39.0')
}
It is recommended to use the OpenTelemetry BOM to keep the versions of the various components in sync.
If you are developing a library that will be used by another application, your code should depend only on opentelemetry-api.
Distributed tracing lets you pinpoint performance bottlenecks, but manual instrumentation gives you the precision to solve them.
The resource describes the entity that generated the telemetry signals. At a minimum, it should carry the name of the service or application. OpenTelemetry defines standard attributes that describe the service's execution environment, such as the host name, host type (cloud, container, serverless), namespace, and cloud resource ID. These attributes are defined under the Resource Semantic Conventions, or semconv.
Here we will create a resource with a few environment attributes.
Attribute | Description | Required |
service.name | It is the logical name of the service. | Yes |
service.namespace | It is used to group services. For example, you can use service.namespace to distinguish services across environments such as QA, UAT, and PROD. | No |
host.name | Name of the host where the service is running. | No |
// Create Resource
AttributesBuilder attrBuilders = Attributes.builder()
    .put(ResourceAttributes.SERVICE_NAME, SERVICE_NAME)
    .put(ResourceAttributes.SERVICE_NAMESPACE, "US-West-1")
    .put(ResourceAttributes.HOST_NAME, "prodsvc.us-west-1.example.com");
Resource serviceResource = Resource.create(attrBuilders.build());
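The values above are hard-coded, which is fine for a demo. In a real deployment they usually come from the environment. As a minimal sketch, the code below parses a `key1=value1,key2=value2` string, the form the OpenTelemetry specification defines for the `OTEL_RESOURCE_ATTRIBUTES` environment variable, into a map. The class `ResourceEnv` and its `parse` method are hypothetical names used for illustration, and the sketch skips the percent-decoding of values that a complete implementation would need.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; illustrates the key=value,key=value format only. */
final class ResourceEnv {

    /** Parses "k1=v1,k2=v2" into an ordered map, skipping malformed entries. */
    static Map<String, String> parse(String raw) {
        Map<String, String> attrs = new LinkedHashMap<>();
        if (raw == null || raw.trim().isEmpty()) {
            return attrs;
        }
        for (String pair : raw.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // no key or no '=' sign: ignore this entry
            }
            String key = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            if (!key.isEmpty()) {
                attrs.put(key, value);
            }
        }
        return attrs;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = parse(System.getenv("OTEL_RESOURCE_ATTRIBUTES"));
        // Fall back to the service name used in this article if none was supplied
        attrs.putIfAbsent("service.name", "Authentication-Service");
        System.out.println(attrs);
    }
}
```

Each entry of the resulting map could then be copied into the AttributesBuilder with put(key, value) before the Resource is created.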
The exporter is the SDK component responsible for sending telemetry signals (here, traces) out of the application: to a remote backend, to a log file, to stdout, and so on.
Consider how distributed tracing impacts system performance. Proper trace sampling can help balance the need for detailed traces with overall system efficiency, preventing performance slowdowns or data overload.
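One common approach is ratio-based sampling keyed on the trace ID, so that every service reaches the same keep-or-drop decision for a given trace. The sketch below is a simplified, stand-alone model of that idea and not the SDK's implementation; the class name `RatioSampler` is hypothetical. The OpenTelemetry SDK ships its own samplers (including a trace-ID-ratio sampler) that you would configure on the tracer provider instead.

```java
/** Hypothetical, simplified model of trace-ID-ratio sampling. */
final class RatioSampler {
    private final long upperBound;

    RatioSampler(double ratio) {
        if (ratio < 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be in [0.0, 1.0]");
        }
        // Map the ratio onto the range of non-negative longs
        this.upperBound = (long) (ratio * Long.MAX_VALUE);
    }

    /** Decides from the trace ID alone, so the result is the same wherever it is computed. */
    boolean shouldSample(String traceIdHex) {
        if (traceIdHex == null || traceIdHex.length() != 32) {
            throw new IllegalArgumentException("trace ID must be 32 hex characters");
        }
        // Treat the low 64 bits of the 128-bit trace ID as a pseudo-random number
        long low = Long.parseUnsignedLong(traceIdHex.substring(16), 16);
        long value = low & Long.MAX_VALUE; // clear the sign bit
        return upperBound == Long.MAX_VALUE || value < upperBound;
    }

    public static void main(String[] args) {
        String traceId = "4bf92f3577b34da6a3ce929d0e0e4736"; // illustrative value
        System.out.println(new RatioSampler(1.0).shouldSample(traceId)); // true: keep everything
        System.out.println(new RatioSampler(0.0).shouldSample(traceId)); // false: drop everything
    }
}
```

Keying the decision on the trace ID, rather than on a fresh random number per span, avoids traces in which some spans were kept and others dropped.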
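In this example, we create a gRPC exporter that sends traces to an OTLP receiver, such as an OpenTelemetry Collector, running on localhost:55680. Port 55680 was the OTLP port in older Collector releases; current releases listen for OTLP over gRPC on port 4317 by default, so adjust the endpoint to match your receiver.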
// Create Span Exporter
OtlpGrpcSpanExporter spanExporter = OtlpGrpcSpanExporter.builder()
    .setEndpoint("http://localhost:55680")
    .build();
A TracerProvider gives you access to a Tracer, the component used to create spans. The provider below exports spans in batches through the span exporter and attaches the resource to every span it produces.
// Create SdkTracerProvider
SdkTracerProvider sdkTracerProvider = SdkTracerProvider.builder()
    .addSpanProcessor(BatchSpanProcessor.builder(spanExporter)
        .setScheduleDelay(100, TimeUnit.MILLISECONDS)
        .build())
    .setResource(serviceResource)
    .build();

// Build the SDK and register it as the global instance. The returned
// instance can also be used directly to get a tracer.
OpenTelemetry openTelemetry = OpenTelemetrySdk.builder()
    .setTracerProvider(sdkTracerProvider)
    .buildAndRegisterGlobal();
You need to configure the SDK and create the tracer as a first step in your application.
With the right configuration in place, developers can monitor their application’s performance in real-time. This enables quick adjustments and optimization, allowing you to address issues or enhance performance as soon as they arise.
Tracer tracer = GlobalOpenTelemetry.getTracer("auth-Service-instrumentation");
// Optionally pass the instrumentation library version as well:
// Tracer tracer = GlobalOpenTelemetry.getTracer("auth-Service-instrumentation", "1.0.0");
// Or use the OpenTelemetry instance from the previous step to get the tracer:
// Tracer tracer = openTelemetry.getTracer("auth-Service-instrumentation");
You can use GlobalOpenTelemetry only if your OpenTelemetry instance was registered as global in the previous step; otherwise, use the OpenTelemetry instance returned by the SDK builder.
The getTracer method requires an instrumentation library name as a parameter, which must not be null.
Registering the instance globally lets any class in the application obtain a tracer without passing the OpenTelemetry object around, which keeps instrumentation consistent when a multi-step workflow touches many components.
Creating and managing spans efficiently is the next step after setting up your OpenTelemetry instrumentation. Properly defining, structuring, and annotating spans will help you understand how operations flow through your system and will help when troubleshooting problems.
A few things help make good spans: span attributes, child spans, and events.
There are also a few best practices to consider to get the most out of your telemetry, some of which include:
Understanding these fundamentals will help your organization optimize your instrumentation to produce more meaningful telemetry. With that, let’s look at some examples of how to create and manage your spans effectively.
Even with well-structured spans, OpenTelemetry instrumentation can sometimes present challenges. Some common troubleshooting techniques include:
By default, OpenTelemetry uses gRPC for exporting telemetry data. However, in some cases, HTTP-based transport methods can be a better alternative, especially when working with legacy systems, firewalls, or monitoring tools that do not support gRPC.
A span represents a single execution of an operation. It is described by a set of attributes, which are sometimes referred to as span tags. Application owners are free to choose the attributes that capture the information they need. The API does not cap the number of attributes per span, although the SDK enforces a configurable limit (128 per span by default).
In this example, we define two span attributes for our sample application.
Span parentSpan = tracer.spanBuilder("doLogin").startSpan();
parentSpan.setAttribute("priority", "business.priority");
parentSpan.setAttribute("prodEnv", true);
You can use the setParent method to correlate spans manually.
Span childSpan = tracer.spanBuilder("child")
    .setParent(Context.current().with(parentSpan))
    .startSpan();
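Whichever way the parent is set, the correlation itself comes down to three identifiers. The sketch below is a simplified, stand-alone model and not the OpenTelemetry API; the class name `SpanIds` and its methods are hypothetical. It shows that a child span reuses the parent's trace ID, receives a fresh span ID, and records the parent's span ID.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical model of the identifiers that link a child span to its parent. */
final class SpanIds {
    final String traceId;      // 32 hex chars, shared by every span in the trace
    final String spanId;       // 16 hex chars, unique to this span
    final String parentSpanId; // null for a root span

    private SpanIds(String traceId, String spanId, String parentSpanId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
    }

    /** Starts a new trace: fresh trace ID, fresh span ID, no parent. */
    static SpanIds root() {
        return new SpanIds(randomHex(16), randomHex(8), null);
    }

    /** Creates a child: same trace ID, fresh span ID, parent set to this span. */
    SpanIds child() {
        return new SpanIds(traceId, randomHex(8), spanId);
    }

    // Random lowercase hex string of the given byte length. A real implementation
    // would also reject the all-zero value, which is not a valid ID.
    private static String randomHex(int bytes) {
        StringBuilder sb = new StringBuilder(bytes * 2);
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        for (int i = 0; i < bytes; i++) {
            sb.append(String.format("%02x", rnd.nextInt(256)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SpanIds parent = SpanIds.root();
        SpanIds child = parent.child();
        System.out.println(child.traceId.equals(parent.traceId));     // true
        System.out.println(child.parentSpanId.equals(parent.spanId)); // true
    }
}
```

A tracing backend uses exactly these links to rebuild the span tree for a trace.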
The OpenTelemetry API also offers an automated way to propagate the parent span on the current thread: call makeCurrent on the parent span, and any span started on that thread while the returned scope is open picks it up as its parent.
try (Scope scope = parentSpan.makeCurrent()) {
    Thread.sleep(200);
    boolean isValid = isValidAuth(username, password);
    // Do login
} catch (Throwable t) {
    // Attach the exception to the span and mark the span as failed
    parentSpan.recordException(t);
    parentSpan.setStatus(StatusCode.ERROR, "Change it to your error message");
} finally {
    // Closing the scope does not end the span; this has to be done manually
    parentSpan.end();
}

// Child method
private boolean isValidAuth(String username, String password) {
    // NOTE: setParent(...) is not required;
    // `Span.current()` is automatically added as the parent
    Span childSpan = tracer.spanBuilder("isValidAuth").startSpan();
    childSpan.setAttribute("Username", username)
        .setAttribute("id", 101);
    // Auth code goes here
    try {
        Thread.sleep(200);
        childSpan.setStatus(StatusCode.OK);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for callers
        childSpan.setStatus(StatusCode.ERROR, "Change it to your error message");
    } finally {
        childSpan.end();
    }
    return true;
}
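Why does the child span find its parent without setParent, and why does closing the scope not end the span? Conceptually, makeCurrent stores the span in a per-thread slot, and closing the scope only restores whatever was in the slot before. The sketch below models that behavior with a plain ThreadLocal, using span names in place of real spans. It is an illustration under that assumption, not the OpenTelemetry implementation; `CurrentSpanDemo` and `DemoScope` are hypothetical names.

```java
/** Hypothetical model of a thread-bound "current span" slot. */
final class CurrentSpanDemo {
    // One slot per thread; holds the name of the span that is currently active
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** A scope whose close() throws no checked exception. */
    interface DemoScope extends AutoCloseable {
        @Override
        void close();
    }

    static String current() {
        return CURRENT.get();
    }

    /** Makes the span current and returns a scope that restores the previous one. */
    static DemoScope makeCurrent(String spanName) {
        String previous = CURRENT.get();
        CURRENT.set(spanName);
        // Closing only swaps the slot back; it does nothing to the span itself
        return () -> CURRENT.set(previous);
    }

    public static void main(String[] args) {
        try (DemoScope outer = makeCurrent("doLogin")) {
            System.out.println(current()); // doLogin
            try (DemoScope inner = makeCurrent("isValidAuth")) {
                System.out.println(current()); // isValidAuth
            }
            System.out.println(current()); // doLogin
        }
        System.out.println(current()); // null
    }
}
```

Because the slot is per thread, work handed to another thread does not see the current span unless the context is passed along explicitly.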
Spans can be enriched with events: timestamped log entries, with optional attributes, recorded while the span is active. Because they are attached to the span, these contextual logs always stay tied to the operation that produced them.
Attributes eventAttributes = Attributes.builder()
    .put("Username", username)
    .put("id", 101)
    .build();
childSpan.addEvent("User Logged In", eventAttributes);
package com.logicmonitor.example;

import io.opentelemetry.api.OpenTelemetry;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.common.AttributesBuilder;
import io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.resources.Resource;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.export.BatchSpanProcessor;
import io.opentelemetry.semconv.resource.attributes.ResourceAttributes;
import java.util.concurrent.TimeUnit;

public class TestApplication {

    private static final String SERVICE_NAME = "Authentication-Service";

    static {
        // Create Resource
        AttributesBuilder attrBuilders = Attributes.builder()
            .put(ResourceAttributes.SERVICE_NAME, SERVICE_NAME)
            .put(ResourceAttributes.SERVICE_NAMESPACE, "US-West-1")
            .put(ResourceAttributes.HOST_NAME, "prodsvc.us-west-1.example.com");
        Resource serviceResource = Resource.create(attrBuilders.build());

        // Create Span Exporter
        OtlpGrpcSpanExporter spanExporter = OtlpGrpcSpanExporter.builder()
            .setEndpoint("http://localhost:55680")
            .build();

        // Create SdkTracerProvider
        SdkTracerProvider sdkTracerProvider = SdkTracerProvider.builder()
            .addSpanProcessor(BatchSpanProcessor.builder(spanExporter)
                .setScheduleDelay(100, TimeUnit.MILLISECONDS)
                .build())
            .setResource(serviceResource)
            .build();

        // Build the SDK and register it as the global instance. The returned
        // instance can also be used directly to get a tracer.
        OpenTelemetry openTelemetry = OpenTelemetrySdk.builder()
            .setTracerProvider(sdkTracerProvider)
            .buildAndRegisterGlobal();
    }

    public static void main(String[] args) throws InterruptedException {
        Auth auth = new Auth();
        auth.doLogin("testUserName", "testPassword");
        // Give the BatchSpanProcessor time to export the spans before the JVM exits
        Thread.sleep(1000);
    }
}
package com.logicmonitor.example;

import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.StatusCode;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

public class Auth {

    Tracer tracer = GlobalOpenTelemetry.getTracer("auth-Service-instrumentation");
    // Tracer tracer = GlobalOpenTelemetry.getTracer("auth-Service-instrumentation", "1.0.0");

    public void doLogin(String username, String password) {
        Span parentSpan = tracer.spanBuilder("doLogin").startSpan();
        parentSpan.setAttribute("priority", "business.priority");
        parentSpan.setAttribute("prodEnv", true);
        try (Scope scope = parentSpan.makeCurrent()) {
            Thread.sleep(200);
            boolean isValid = isValidAuth(username, password);
            // Do login
        } catch (Throwable t) {
            // Attach the exception to the span and mark the span as failed
            parentSpan.recordException(t);
            parentSpan.setStatus(StatusCode.ERROR, "Change it to your error message");
        } finally {
            // Closing the scope does not end the span; this has to be done manually
            parentSpan.end();
        }
    }

    private boolean isValidAuth(String username, String password) {
        // NOTE: setParent(...) is not required;
        // `Span.current()` is automatically added as the parent
        Span childSpan = tracer.spanBuilder("isValidAuth").startSpan();
        // Auth code goes here
        try {
            Thread.sleep(200);
            childSpan.setStatus(StatusCode.OK);
            Attributes eventAttributes = Attributes.builder()
                .put("Username", username)
                .put("id", 101)
                .build();
            childSpan.addEvent("User Logged In", eventAttributes);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            childSpan.setStatus(StatusCode.ERROR, "Change it to your error message");
        } finally {
            childSpan.end();
        }
        return true;
    }
}
Run TestApplication.java. The resulting trace contains two spans:
Parent Span: doLogin
Child Span: isValidAuth
Congratulations, you have just written a Java application emitting traces using the OpenTelemetry Protocol (OTLP) Specification. Feel free to use this code as a reference when you get started with instrumenting your business application with OTLP specifications. LogicMonitor APM specification is 100% OTLP compliant with no vendor lock-in. To receive and visualize traces of multiple services for troubleshooting with the LogicMonitor platform, sign up for a free trial account here. Check back for more blogs covering application instrumentation steps for distributed tracing with OpenTelemetry standards across multiple languages.
Distributed tracing plays a crucial role in maintaining system stability and minimizing service disruptions. By monitoring traces across various components, you can ensure more reliable operation and higher uptime, even in complex environments. Unlock the full potential of distributed tracing with LogicMonitor’s powerful monitoring platform.
© LogicMonitor 2025 | All rights reserved. | All trademarks, trade names, service marks, and logos referenced herein belong to their respective companies.