What is the Jaeger agent for in the storj node application?

Kiwwiaq · January 19, 2025, 10:54pm

Hi,

I am setting up outbound firewall rules for the storj node and found blocked traffic to agent.tracing.datasci.storj.io:5775. I cannot find info on the forum related to this.

What is the purpose of this communication please?

From config.yaml:

#address for jaeger agent
#tracing.agent-addr: agent.tracing.datasci.storj.io:5775

#application name for tracing identification
#tracing.app: storagenode

Thank you.

Alexey · January 20, 2025, 2:50am

Hello @Kiwwiaq,
Welcome back!

You should not set outbound rules at all. Either disable them, or add a permissive rule that allows connections from any port of your node to any host and any port on the internet.

This agent sends anonymous usage statistics from your node. You may disable it if you want.

Kiwwiaq · January 20, 2025, 6:38pm

Hi,

Considering the current climate on the internet, I would argue that everybody absolutely should control any traffic coming in and out of managed networks.

A Storj node operator should block all unknown traffic, allowing only established and related connections plus new traffic that is recognized. This matters especially because the storj software lives its own life and updates itself without requiring node operator intervention.

It could be argued that IP addresses can change over time, so it is up to the node operator to watch for changes, or to allow connections by port without a destination IP.

I currently see these new outbound connections:

version.storj.io:443 → regular version checks
github:443 → downloads from github
collectora.storj.io:9000 → software telemetry data collection
agent.tracing.datasci.storj.io:5775 → anonymous usage statistics
satellites:7777 → satellite communication
certs.alpha.storj.io:8888 → identity related only

Over the last couple of days I have seen no other blocked connections, and the storj node reports all good, so this should be a complete list for now.
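The list above could be kept as a small lookup table that a script consults when reviewing blocked-connection logs. This is only a minimal sketch under stated assumptions: the hosts and ports are the ones observed in this post, `github.com` stands in for the "github" entry, satellites are matched by port alone because their hostnames are not listed here, and the names `ALLOWED_OUTBOUND` and `describe` are hypothetical, not part of the storj software.

```python
from typing import Optional

# Outbound destinations as observed in this post: (host, port) -> purpose.
# A host of None means "any host on that port" (assumption: satellite
# hostnames vary, so they are matched by port only).
ALLOWED_OUTBOUND = {
    ("version.storj.io", 443): "regular version checks",
    ("github.com", 443): "downloads from GitHub",  # assumed hostname
    ("collectora.storj.io", 9000): "software telemetry data collection",
    ("agent.tracing.datasci.storj.io", 5775): "anonymous usage statistics",
    (None, 7777): "satellite communication",
    ("certs.alpha.storj.io", 8888): "identity related only",
}


def describe(host: str, port: int) -> Optional[str]:
    """Return the purpose of a recognized destination, or None if unknown."""
    exact = ALLOWED_OUTBOUND.get((host.lower(), port))
    if exact is not None:
        return exact
    # Fall back to port-only entries such as the satellite rule.
    return ALLOWED_OUTBOUND.get((None, port))


if __name__ == "__main__":
    samples = [("agent.tracing.datasci.storj.io", 5775), ("example.org", 8080)]
    for host, port in samples:
        purpose = describe(host, port)
        print(f"{host}:{port} -> {purpose or 'UNRECOGNIZED'}")
```

Anything that comes back as unrecognized is a candidate for a closer look before a firewall rule is added for it.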

Could you please point me to any documentation that roughly describes what data is collected by collectora.storj.io:9000 and agent.tracing.datasci.storj.io:5775? I have no problem supporting the project by sending such data if it helps smoother node operation and software development.

Alexey · January 21, 2025, 4:48am

It sounds like you have mixed up inbound connections, which must be protected, with your own outbound connections. Since the node is p2p software, it cannot work normally if you block or filter outbound connections, or block inbound connections to the node's port. If you want your node to work normally, please disable all outbound filters and blocks; they are meaningless for p2p.

The best documentation we have for the statistics agents is our open source code on GitHub.
But here are also the design document and the README:

github.com

storj/monkit-jaeger/blob/52b0792fa6cd5b96ed55a2a94edbfa50ec67cf1a/README.md?plain=1

# monkit-jaeger

A plugin for http://github.com/spacemonkeygo/monkit that supports Jaeger.

## development

Thrift helpers are generated with:

```
thrift -r --gen 'go:package_prefix=storj.io/monkit-jaeger/gen-go/' agent.thrift
```

## License

Copyright (C) 2020 Storj Labs, Inc.
Copyright (C) 2016 Space Monkey, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

This file has been truncated.

github.com

storj/design-docs/blob/f6b8e7b3124326228ee4ae0e81e06e3d5007edef/20200213-distributed-tracing.md?plain=1

tags: []
---

# Distributed Tracing

## Abstract

This document details why implementing distributed tracing can help us troubleshoot production issues and the steps needed to implement.

## Background

Distributed tracing enables the ability to visualize how operations are executed in sequence and in parallel within service boundaries and across services which can greatly help troubleshoot development and production issues. Distributed tracing allows you to begin troubleshooting from a high level instead of the common approach of starting from log files or stack traces which can be less informative to operational staff. Traces can begin from a user client perspective all the way down to the database tier while enabling any process operation to attach metadata along the traced path.

For example, a typical use case of distributed tracing in a web application is that a trace begins at a web browsers HTTP request, attaches pertinent header and request information which propagates to the API service backend and eventually reaches the database layer which attaches SQL query metadata. If anything in this trace behaves slowly or throws an error you can use tools like Zipkin, Jaeger, DataDog or any OpenTracing compatible visualization tool to search and analyze what happened within a trace (request).

## Design

![](https://www.jaegertracing.io/img/architecture-v1.png)
_Figure 1. Typical Jaeger architecture._

This file has been truncated.

You can disable this tracing in the configuration (or point it at localhost, for example); then it won't send any statistics about the node's behavior.
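As a rough illustration of the localhost option, the sketch below rewrites the `tracing.agent-addr` line quoted from config.yaml earlier in this thread so that it points at a local address. The function name `point_tracing_at_localhost` and the value `127.0.0.1:5775` are assumptions made for the example; only the `tracing.agent-addr` key comes from the config itself, and editing the file by hand works just as well.

```python
import re


def point_tracing_at_localhost(config_text: str, addr: str = "127.0.0.1:5775") -> str:
    """Set tracing.agent-addr to addr, uncommenting the line if needed."""
    # Match the key whether or not it is commented out; stay on one line.
    pattern = re.compile(r"^[ \t]*#?[ \t]*tracing\.agent-addr:.*$", re.MULTILINE)
    replacement = f"tracing.agent-addr: {addr}"
    if pattern.search(config_text):
        # Replace only the first occurrence; a lambda avoids escape handling.
        return pattern.sub(lambda match: replacement, config_text, count=1)
    # Key not present at all: append it at the end of the file.
    return config_text.rstrip("\n") + "\n" + replacement + "\n"


if __name__ == "__main__":
    sample = (
        "#address for jaeger agent\n"
        "#tracing.agent-addr: agent.tracing.datasci.storj.io:5775\n"
    )
    print(point_tracing_at_localhost(sample), end="")
```

The node would most likely need a restart to pick up the changed value.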

Kiwwiaq · January 21, 2025, 7:48pm

I have not mixed up in and out. I am discussing new traffic coming out of the storage node. As long as I properly allow related and established incoming traffic, the node is fine. Today marks a week of testing. I see no more unrecognized traffic, and the node behaves and operates as expected.

Thank you for pointing me to the documentation. I will consider leaving the telemetry and error collection enabled.

arrogantrabbit · January 21, 2025, 8:33pm

Customers initiate connections to your node. It would not have worked if you did not allow new inbound connections.

Telemetry makes it possible to improve the product experience, including for you personally: a vendor can only address issues they know about. There is no reason to disable telemetry, other than the performance burden of a poorly implemented one (looking at Windows). I'd argue that disabling telemetry, not enabling it, should be the decision that requires consideration.

Kiwwiaq · January 21, 2025, 10:52pm

Please note that I am discussing only new outbound traffic initiated by the storj node software.

It is certainly true that such telemetry enables rapid product improvements, but transparency about it enables trust. A simple table of network communication ports on the prerequisites page, with a short description of each, would surely satisfy not only curious operators but advanced ones as well.