Merged
135 commits
33324cf
wip
kamilkisiela Feb 27, 2025
77d9dee
/auth-otel
kamilkisiela Mar 20, 2025
cc1a219
perf
kamilkisiela Mar 21, 2025
b5af63d
comments
kamilkisiela Mar 21, 2025
fe88515
cleanups
kamilkisiela Mar 21, 2025
6e79d21
Add react-scan in local development mode
kamilkisiela Mar 21, 2025
162bdf0
asdasd
kamilkisiela Mar 21, 2025
14eb70a
ye
kamilkisiela Mar 22, 2025
b87c72b
asd
kamilkisiela Mar 22, 2025
6b02c6c
otel database migrations
n1ru4l May 28, 2025
a98d22f
sort columns and indices
n1ru4l May 28, 2025
d351d25
lets include codes
n1ru4l May 28, 2025
03c76cc
prefix spans with `hive`
n1ru4l May 30, 2025
8ecb946
fix typo
n1ru4l May 30, 2025
7b4776e
lol
n1ru4l May 30, 2025
9037589
fix typos
n1ru4l May 30, 2025
add23a8
wip
n1ru4l Jun 2, 2025
fcde2ac
add example sdl
n1ru4l Jun 2, 2025
32007da
wip
n1ru4l Jun 4, 2025
de3e4e1
moar wip
n1ru4l Jun 6, 2025
8efe2be
moar wip
n1ru4l Jun 6, 2025
f3787fe
ui consistency
n1ru4l Jun 10, 2025
26bdca0
real data
n1ru4l Jun 10, 2025
5b72a21
subgraph execute
n1ru4l Jun 26, 2025
2609624
this seems more smart
n1ru4l Jun 26, 2025
1ed5197
wip
n1ru4l Jun 27, 2025
43e60a2
filter based on duration and period
n1ru4l Jun 30, 2025
fbc0c86
implement reset timeline
n1ru4l Jun 30, 2025
15ff96d
avoid `[""]`
n1ru4l Jun 30, 2025
889b41b
filter by error code
n1ru4l Jun 30, 2025
d2acec9
all them filters work
n1ru4l Jun 30, 2025
4f6edbd
reset state
n1ru4l Jun 30, 2025
e3f2546
time buckets
n1ru4l Jul 3, 2025
3dc2e55
add a script for seeding otel data
n1ru4l Jul 3, 2025
49f495c
typo
n1ru4l Jul 3, 2025
719812d
enable operation name and operation type filter
n1ru4l Jul 4, 2025
6e8755c
how about we show the real operation
n1ru4l Jul 4, 2025
6e8be2d
reuse components
n1ru4l Jul 4, 2025
67803e0
???
n1ru4l Jul 4, 2025
8f9fd31
store hash
n1ru4l Jul 7, 2025
b1bbecb
client and hash
n1ru4l Jul 7, 2025
9485b8f
improve seed
n1ru4l Jul 7, 2025
98df8ef
たいへん (Japanese: "tough going")
n1ru4l Jul 7, 2025
3283fe3
error codes
n1ru4l Jul 7, 2025
5988fd7
each trace shall be unique
n1ru4l Jul 7, 2025
d59b37e
fix scroll area unlimited width
n1ru4l Jul 7, 2025
e6fa908
some security
n1ru4l Jul 7, 2025
ce757bd
span events
n1ru4l Jul 8, 2025
1d943cc
better like this
n1ru4l Jul 8, 2025
ec69250
fix status ok/error
n1ru4l Jul 14, 2025
e90b677
fix: show real amount of events
n1ru4l Jul 14, 2025
04d464c
display filtered out requests in diagram
n1ru4l Jul 14, 2025
113db84
feat: dedicated permission for reporting otel traces
n1ru4l Jul 14, 2025
27aa07c
show 100%
n1ru4l Jul 14, 2025
844bf2c
deployment configuration
n1ru4l Jul 14, 2025
ef54287
fix types
n1ru4l Jul 14, 2025
1e85442
feature flag for otel traces
n1ru4l Jul 14, 2025
e05bd70
fix types
n1ru4l Jul 14, 2025
076a69c
fix a bunch of linting issues
n1ru4l Jul 14, 2025
ad9cb5a
fix runtime error
n1ru4l Jul 14, 2025
6cd1e48
feature flag
n1ru4l Jul 15, 2025
73f6987
implement time bucket selection via diagram
n1ru4l Jul 15, 2025
1fc6c0f
save the day
n1ru4l Jul 15, 2025
6851ff2
order by timestamp or duration asc/desc
n1ru4l Jul 15, 2025
7e57d0a
lel
n1ru4l Jul 15, 2025
0ca9fa1
build it
n1ru4l Jul 15, 2025
d1c7c31
rename attributes
n1ru4l Jul 17, 2025
e9fb23f
handle faulty graphql operation traces
n1ru4l Sep 19, 2025
a29004c
some margin
n1ru4l Sep 19, 2025
406e28c
nicer unknown value formatting
n1ru4l Sep 19, 2025
dd4c1de
enable otel-collector self-reporting metrics and scraping
dotansimha Sep 21, 2025
f41f810
added memory limiter
dotansimha Sep 21, 2025
9096c9a
allow unknown oepration type filtering
n1ru4l Sep 22, 2025
75defce
wip: laod test
n1ru4l Sep 22, 2025
48745b1
typefix
n1ru4l Sep 22, 2025
223c016
server preset pls
n1ru4l Sep 22, 2025
7d767b9
single
n1ru4l Sep 22, 2025
8951481
not my type
n1ru4l Sep 22, 2025
f1a5cf4
if we use it it must be there :)
n1ru4l Sep 22, 2025
c8178ff
lel
n1ru4l Sep 23, 2025
5b36999
jeeee
n1ru4l Sep 23, 2025
c28b7d1
ooops
n1ru4l Sep 23, 2025
34a786d
int
n1ru4l Sep 23, 2025
d7165c1
use up 2 date attributes
n1ru4l Sep 23, 2025
5fa83f4
remove the rouge span
n1ru4l Sep 24, 2025
23a29d7
adjust fixtures
n1ru4l Sep 24, 2025
419ff02
view operation for subgraph call
n1ru4l Sep 24, 2025
5b2e8ce
tweak some log
n1ru4l Sep 24, 2025
7291f92
inline auth extension
n1ru4l Sep 24, 2025
6dbb5a4
slight config tweaking
n1ru4l Sep 25, 2025
2b2c819
use unknown everywhere
n1ru4l Sep 25, 2025
0825082
loading state
n1ru4l Sep 25, 2025
f77b90a
feat: pagination
n1ru4l Sep 25, 2025
92f2288
lint
n1ru4l Sep 26, 2025
17ef771
feat: page header for traces
n1ru4l Sep 26, 2025
8b64b4f
i got that sticky icky
n1ru4l Sep 26, 2025
ee6837f
improve trace detail view header
n1ru4l Sep 26, 2025
8157b56
404 component not found
n1ru4l Sep 26, 2025
9b9ec19
lint
n1ru4l Sep 26, 2025
be29f3a
improve bucket selection script
n1ru4l Sep 26, 2025
8ebc467
improve buckets
n1ru4l Sep 30, 2025
8321dde
time bucket end
n1ru4l Oct 1, 2025
1788051
interval
n1ru4l Oct 1, 2025
fe84101
lint
n1ru4l Oct 1, 2025
8a54201
fix not found state flashing
n1ru4l Oct 1, 2025
94d1785
fix bucket end
n1ru4l Oct 1, 2025
3fadf3e
use date range filter
n1ru4l Oct 1, 2025
bda1ef4
quick range fix
n1ru4l Oct 2, 2025
5b4638f
empty state events
n1ru4l Oct 2, 2025
9f5f4bd
cleanup and linting
n1ru4l Oct 2, 2025
5223c16
cleanup
n1ru4l Oct 2, 2025
403b932
show quick filter selections as label
n1ru4l Oct 7, 2025
17dd91d
use config
n1ru4l Oct 8, 2025
fd1d910
back off
n1ru4l Oct 8, 2025
51fbbf7
no low cardinality
n1ru4l Oct 8, 2025
ce3521f
stable af
n1ru4l Oct 8, 2025
e85298b
DateTime64
n1ru4l Oct 8, 2025
66fbeec
typo
n1ru4l Oct 8, 2025
62dc67c
we dont need that
n1ru4l Oct 8, 2025
98f42be
remove docs
n1ru4l Oct 8, 2025
87cb181
dont leak my localhost
n1ru4l Oct 8, 2025
066ef7d
meh
n1ru4l Oct 8, 2025
752c778
fix: implementation
n1ru4l Oct 9, 2025
15e697d
fix: only use jsurl2 for traces route
n1ru4l Oct 9, 2025
601dd34
lean resolvers
n1ru4l Oct 10, 2025
fbba5d1
remove unneeded resolvers
n1ru4l Oct 10, 2025
3e4f81d
lint
n1ru4l Oct 10, 2025
676ccac
i should comit this too :)
n1ru4l Oct 10, 2025
e26937f
Merge remote-tracking branch 'origin/main' into kamil-otel
n1ru4l Oct 10, 2025
3397e81
hm
n1ru4l Oct 10, 2025
a0f0b1c
local hiveauth extension metadata
enisdenjo Oct 10, 2025
1366d4a
revert healtcheckextension
enisdenjo Oct 10, 2025
30b4229
local extension with package name
enisdenjo Oct 10, 2025
24c22c5
clean up
enisdenjo Oct 10, 2025
909be46
typo
enisdenjo Oct 10, 2025
17 changes: 17 additions & 0 deletions configs/gateway.config.ts
@@ -0,0 +1,17 @@
import { defineConfig } from '@graphql-hive/gateway';
import { hiveTracingSetup } from '@graphql-hive/plugin-opentelemetry/setup';
import { AsyncLocalStorageContextManager } from '@opentelemetry/context-async-hooks'; // install

hiveTracingSetup({
  contextManager: new AsyncLocalStorageContextManager(),
  target: process.env.HIVE_TRACING_TARGET!,
  accessToken: process.env.HIVE_TRACING_ACCESS_TOKEN!,
  // optional, for self-hosting
  endpoint: process.env.HIVE_TRACING_ENDPOINT!,
});

export const gatewayConfig = defineConfig({
  openTelemetry: {
    traces: true,
  },
});
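The non-null assertions (`!`) in this config only satisfy the type checker; an unset variable still reaches `hiveTracingSetup` as `undefined`. A minimal sketch of a fail-fast alternative follows. `requireEnv`, `readTracingEnv`, and `TracingEnv` are hypothetical helpers introduced for illustration and are not part of `@graphql-hive/gateway` or this PR.

```typescript
// Hypothetical helpers (not part of the PR or of @graphql-hive/gateway).
type Env = Record<string, string | undefined>;

// Returns the variable's value, or throws when it is unset or empty.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

interface TracingEnv {
  target: string;
  accessToken: string;
  endpoint: string | undefined;
}

// Collects the three variables used by the gateway config above.
function readTracingEnv(env: Env): TracingEnv {
  return {
    target: requireEnv(env, 'HIVE_TRACING_TARGET'),
    accessToken: requireEnv(env, 'HIVE_TRACING_ACCESS_TOKEN'),
    // The config comment marks the endpoint as optional (self-hosting),
    // so an unset or empty value is passed on as undefined.
    endpoint: env['HIVE_TRACING_ENDPOINT'] || undefined,
  };
}
```

In a Node.js process this would presumably be called as `readTracingEnv(process.env)` and the result spread into the `hiveTracingSetup` call.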
12 changes: 12 additions & 0 deletions deployment/index.ts
@@ -14,6 +14,7 @@ import { configureGithubApp } from './services/github';
 import { deployGraphQL } from './services/graphql';
 import { deployKafka } from './services/kafka';
 import { deployObservability } from './services/observability';
+import { deployOTELCollector } from './services/otel-collector';
 import { deploySchemaPolicy } from './services/policy';
 import { deployPostgres } from './services/postgres';
 import { deployProxy } from './services/proxy';
@@ -278,6 +279,15 @@ if (hiveAppPersistedDocumentsAbsolutePath && RUN_PUBLISH_COMMANDS) {
   });
 }

+const otelCollector = deployOTELCollector({
+  environment,
+  graphql,
+  dbMigrations,
+  clickhouse,
+  image: docker.factory.getImageId('otel-collector', imagesTag),
+  docker,
+});
+
 const app = deployApp({
   environment,
   graphql,
@@ -306,6 +316,7 @@ const proxy = deployProxy({
   usage,
   environment,
   publicGraphQLAPIGateway,
+  otelCollector,
 });

 deployCloudFlareSecurityTransform({
@@ -332,4 +343,5 @@ export const schemaApiServiceId = schema.service.id;
 export const webhooksApiServiceId = webhooks.service.id;

 export const appId = app.deployment.id;
+export const otelCollectorId = otelCollector.deployment.id;
 export const publicIp = proxy.get()!.status.loadBalancer.ingress[0].ip;
5 changes: 5 additions & 0 deletions deployment/services/environment.ts
@@ -79,6 +79,11 @@ export function prepareEnvironment(input: {
         cpuLimit: isProduction ? '512m' : '150m',
         memoryLimit: isProduction ? '1000Mi' : '300Mi',
       },
+      tracingCollector: {
+        cpuLimit: isProduction ? '1000m' : '100m',
+        memoryLimit: isProduction ? '2000Mi' : '200Mi',
+        maxReplicas: isProduction || isStaging ? 3 : 1,
+      },
     },
   };
 }
60 changes: 60 additions & 0 deletions deployment/services/otel-collector.ts
@@ -0,0 +1,60 @@
import { serviceLocalEndpoint } from '../utils/local-endpoint';
import { ServiceDeployment } from '../utils/service-deployment';
import { Clickhouse } from './clickhouse';
import { DbMigrations } from './db-migrations';
import { Docker } from './docker';
import { Environment } from './environment';
import { GraphQL } from './graphql';

export type OTELCollector = ReturnType<typeof deployOTELCollector>;

export function deployOTELCollector(args: {
  image: string;
  environment: Environment;
  docker: Docker;
  clickhouse: Clickhouse;
  dbMigrations: DbMigrations;
  graphql: GraphQL;
}) {
  return new ServiceDeployment(
    'otel-collector',
    {
      image: args.image,
      imagePullSecret: args.docker.secret,
      env: {
        ...args.environment.envVars,
        HIVE_OTEL_AUTH_ENDPOINT: serviceLocalEndpoint(args.graphql.service).apply(
          value => value + '/otel-auth',
        ),
      },
      /**
       * We are using the healthcheck extension.
       * https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/extension/healthcheckextension
       */
      probePort: 13133,
      readinessProbe: '/',
      livenessProbe: '/',
      startupProbe: '/',
      exposesMetrics: true,
      replicas: args.environment.podsConfig.tracingCollector.maxReplicas,
      pdb: true,
      availabilityOnEveryNode: true,
      port: 4318,
      memoryLimit: args.environment.podsConfig.tracingCollector.memoryLimit,
      autoScaling: {
        maxReplicas: args.environment.podsConfig.tracingCollector.maxReplicas,
        cpu: {
          limit: args.environment.podsConfig.tracingCollector.cpuLimit,
          cpuAverageToScale: 80,
        },
      },
    },
    [args.clickhouse.deployment, args.clickhouse.service, args.dbMigrations],
  )
    .withSecret('CLICKHOUSE_HOST', args.clickhouse.secret, 'host')
    .withSecret('CLICKHOUSE_PORT', args.clickhouse.secret, 'port')
    .withSecret('CLICKHOUSE_USERNAME', args.clickhouse.secret, 'username')
    .withSecret('CLICKHOUSE_PASSWORD', args.clickhouse.secret, 'password')
    .withSecret('CLICKHOUSE_PROTOCOL', args.clickhouse.secret, 'protocol')
    .deploy();
}
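How `ServiceDeployment` turns the `autoScaling` block into cluster resources is not shown in this diff. Assuming it maps to a Kubernetes HorizontalPodAutoscaler with a CPU utilization target of `cpuAverageToScale` percent, the scaling decision can be sketched with the documented HPA rule. `desiredReplicas` and its `minReplicas` parameter are hypothetical names for illustration, and the sketch ignores the HPA tolerance band and stabilization window.

```typescript
// Sketch of the Kubernetes HPA rule, under the assumption stated above:
//   desired = ceil(currentReplicas * currentUtilization / targetUtilization)
// clamped to [minReplicas, maxReplicas].
function desiredReplicas(
  currentReplicas: number,
  currentCpuPercent: number,
  targetCpuPercent: number,
  minReplicas: number,
  maxReplicas: number,
): number {
  const raw = Math.ceil(currentReplicas * (currentCpuPercent / targetCpuPercent));
  return Math.min(maxReplicas, Math.max(minReplicas, raw));
}

// With the production values from this PR (target 80, maxReplicas 3)
// and an assumed minimum of 1 replica:
const scaledUp = desiredReplicas(1, 120, 80, 1, 3); // 2
const capped = desiredReplicas(2, 160, 80, 1, 3); // 3, capped by maxReplicas
```

Note that the deployment above sets `replicas` to the same value as `autoScaling.maxReplicas`, so under this assumption the collector starts at its ceiling and can only scale down.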
11 changes: 11 additions & 0 deletions deployment/services/proxy.ts
@@ -5,6 +5,7 @@ import { App } from './app';
 import { Environment } from './environment';
 import { GraphQL } from './graphql';
 import { Observability } from './observability';
+import { OTELCollector } from './otel-collector';
 import { type PublicGraphQLAPIGateway } from './public-graphql-api-gateway';
 import { Usage } from './usage';

@@ -15,13 +16,15 @@ export function deployProxy({
   environment,
   observability,
   publicGraphQLAPIGateway,
+  otelCollector,
 }: {
   observability: Observability;
   environment: Environment;
   graphql: GraphQL;
   app: App;
   usage: Usage;
   publicGraphQLAPIGateway: PublicGraphQLAPIGateway;
+  otelCollector: OTELCollector;
 }) {
   const { tlsIssueName } = new CertManager().deployCertManagerAndIssuer();
   const commonConfig = new pulumi.Config('common');
@@ -113,5 +116,13 @@ export function deployProxy({
       requestTimeout: '60s',
       retriable: true,
     },
+    {
+      name: 'otel-traces',
+      path: '/otel/v1/traces',
+      customRewrite: '/v1/traces',
+      service: otelCollector.service,
+      requestTimeout: '60s',
+      retriable: true,
+    },
   ]);
 }
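The new route exposes the collector's OTLP/HTTP traces endpoint under `/otel/v1/traces` and rewrites it to the collector's own `/v1/traces` path. The proxy's exact matching semantics are not visible in this diff; the sketch below assumes a simple prefix match, and `RouteRule` and `rewritePath` are hypothetical names for illustration.

```typescript
// Hypothetical model of a route with a path rewrite (prefix matching is an assumption).
interface RouteRule {
  path: string;
  customRewrite: string;
}

// Returns the upstream path for a matching request, or null when the rule does not apply.
function rewritePath(rule: RouteRule, requestPath: string): string | null {
  if (requestPath === rule.path) {
    return rule.customRewrite;
  }
  if (requestPath.startsWith(rule.path + '/')) {
    // Keep whatever follows the matched prefix.
    return rule.customRewrite + requestPath.slice(rule.path.length);
  }
  return null;
}

const otelTraces: RouteRule = { path: '/otel/v1/traces', customRewrite: '/v1/traces' };
const upstream = rewritePath(otelTraces, '/otel/v1/traces'); // '/v1/traces'
```

The rewrite lets the collector keep its default OTLP/HTTP path while the public hostname namespaces it under `/otel`.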
15 changes: 9 additions & 6 deletions deployment/utils/service-deployment.ts
@@ -40,6 +40,8 @@ export class ServiceDeployment {
       args?: kx.types.Container['args'];
       image: string;
       port?: number;
+      /** Port to use for liveness, startup and readiness probes. */
+      probePort?: number;
       serviceAccountName?: pulumi.Output<string>;
       livenessProbe?: string | ProbeConfig;
       readinessProbe?: string | ProbeConfig;
@@ -107,6 +109,7 @@ export class ServiceDeployment {

   createPod(asJob: boolean) {
     const port = this.options.port || 3000;
+    const probePort = this.options.probePort ?? port;
     const additionalEnv: any[] = normalizeEnv(this.options.env);
     const secretsEnv: any[] = normalizeEnvSecrets(this.envSecrets);

@@ -125,14 +128,14 @@ export class ServiceDeployment {
               timeoutSeconds: 5,
               httpGet: {
                 path: this.options.livenessProbe,
-                port,
+                port: probePort,
               },
             }
           : {
               ...this.options.livenessProbe,
               httpGet: {
                 path: this.options.livenessProbe.endpoint,
-                port,
+                port: probePort,
               },
             };
     }
@@ -147,14 +150,14 @@ export class ServiceDeployment {
               timeoutSeconds: 5,
               httpGet: {
                 path: this.options.readinessProbe,
-                port,
+                port: probePort,
               },
             }
           : {
               ...this.options.readinessProbe,
               httpGet: {
                 path: this.options.readinessProbe.endpoint,
-                port,
+                port: probePort,
               },
             };
     }
@@ -169,14 +172,14 @@ export class ServiceDeployment {
               timeoutSeconds: 10,
               httpGet: {
                 path: this.options.startupProbe,
-                port,
+                port: probePort,
               },
             }
           : {
               ...this.options.startupProbe,
               httpGet: {
                 path: this.options.startupProbe.endpoint,
-                port,
+                port: probePort,
               },
             };
     }
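The change to `createPod` separates the port that serves traffic from the port that Kubernetes probes. A minimal sketch of the fallback logic, reduced to the two lines that matter, is shown here; `PortOptions` and `resolvePorts` are hypothetical names, while the `||` and `??` expressions mirror the diff.

```typescript
// Hypothetical reduction of the option handling in createPod.
interface PortOptions {
  port?: number;
  probePort?: number;
}

function resolvePorts(options: PortOptions): { port: number; probePort: number } {
  // Existing behavior: fall back to 3000 when no port is configured.
  const port = options.port || 3000;
  // New behavior: probes use probePort when set, otherwise the service port.
  const probePort = options.probePort ?? port;
  return { port, probePort };
}

// The collector serves OTLP/HTTP on 4318 and health checks on 13133.
const collector = resolvePorts({ port: 4318, probePort: 13133 });
// Services that do not set probePort keep probing their service port.
const legacy = resolvePorts({ port: 4000 });
```

Because `probePort` defaults to `port`, existing deployments are unaffected; only the collector, whose health check extension listens on a separate port, needs to set it.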
30 changes: 30 additions & 0 deletions docker/configs/otel-collector/builder-config.yaml
@@ -0,0 +1,30 @@
dist:
  version: 0.122.0
  name: otelcol-custom
  description: Custom OTel Collector distribution
  output_path: ./otelcol-custom

receivers:
  - gomod: go.opentelemetry.io/collector/receiver/otlpreceiver v0.122.0

processors:
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.122.0
  - gomod: go.opentelemetry.io/collector/processor/memorylimiterprocessor v0.122.0
  - gomod:
      github.com/open-telemetry/opentelemetry-collector-contrib/processor/attributesprocessor
      v0.122.0
  - gomod:
      github.com/open-telemetry/opentelemetry-collector-contrib/processor/filterprocessor v0.122.0

exporters:
  - gomod: go.opentelemetry.io/collector/exporter/debugexporter v0.122.0
  - gomod:
      github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter v0.122.0

extensions:
  - gomod:
      github.com/open-telemetry/opentelemetry-collector-contrib/extension/healthcheckextension
      v0.122.0
  - gomod: github.com/graphql-hive/console/docker/configs/otel-collector/extension-hiveauth v0.0.0
    path: ./extension-hiveauth
    name: hiveauthextension # when using local extensions, package name is required, otherwise you get "missing import path"
79 changes: 79 additions & 0 deletions docker/configs/otel-collector/config.yaml
@@ -0,0 +1,79 @@
extensions:
  hiveauth:
    endpoint: ${HIVE_OTEL_AUTH_ENDPOINT}
  health_check:
    endpoint: '0.0.0.0:13133'
receivers:
  otlp:
    protocols:
      grpc:
        include_metadata: true
        endpoint: '0.0.0.0:4317'
        auth:
          authenticator: hiveauth
      http:
        cors:
          allowed_origins: ['*']
          allowed_headers: ['*']
        include_metadata: true
        endpoint: '0.0.0.0:4318'
        auth:
          authenticator: hiveauth
processors:
  batch:
    timeout: 5s
    send_batch_size: 10000
  attributes:
    actions:
      - key: hive.target_id
        from_context: auth.targetId
        action: insert
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80
    spike_limit_percentage: 20
exporters:
  debug:
    verbosity: detailed
    sampling_initial: 5
    sampling_thereafter: 200
  clickhouse:
    endpoint: ${CLICKHOUSE_PROTOCOL}://${CLICKHOUSE_HOST}:${CLICKHOUSE_PORT}?dial_timeout=10s&compress=lz4&async_insert=1
    database: default
    async_insert: true
    username: ${CLICKHOUSE_USERNAME}
    password: ${CLICKHOUSE_PASSWORD}
    create_schema: false
    ttl: 720h
    compress: lz4
    logs_table_name: otel_logs
    traces_table_name: otel_traces
    metrics_table_name: otel_metrics
    timeout: 5s
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
service:
  extensions:
    - hiveauth
    - health_check
  telemetry:
    logs:
      level: INFO
      encoding: json
      output_paths: ['stdout']
      error_output_paths: ['stderr']
    metrics:
      address: '0.0.0.0:10254'
  pipelines:
    traces:
      receivers: [otlp]
      processors:
        - memory_limiter
        - attributes
        - batch
      exporters:
        - clickhouse
        # - debug
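The `memory_limiter` percentages are relative to the memory available to the collector process. Assuming the collector detects the container limit from `deployment/services/environment.ts` as its total memory, and following the memory limiter's documented rule that the soft limit is the hard limit minus the spike limit, the thresholds can be worked out as below. `memoryLimiterThresholds` is a hypothetical helper for this arithmetic only.

```typescript
// Arithmetic sketch for limit_percentage / spike_limit_percentage.
// Assumption: total memory equals the pod's memory limit in Mi.
function memoryLimiterThresholds(
  totalMi: number,
  limitPercentage: number,
  spikeLimitPercentage: number,
): { hardLimitMi: number; softLimitMi: number } {
  const hardLimitMi = Math.floor((totalMi * limitPercentage) / 100);
  const spikeMi = Math.floor((totalMi * spikeLimitPercentage) / 100);
  // Soft limit: above this the limiter starts refusing data.
  // Hard limit: above this it additionally forces garbage collection.
  return { hardLimitMi, softLimitMi: hardLimitMi - spikeMi };
}

// Production pod limit is '2000Mi', other environments use '200Mi'.
const production = memoryLimiterThresholds(2000, 80, 20); // hard 1600, soft 1200
const development = memoryLimiterThresholds(200, 80, 20); // hard 160, soft 120
```

Placing `memory_limiter` first in the traces pipeline, as the config does, means back-pressure is applied before the `attributes` and `batch` processors allocate further memory.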
28 changes: 28 additions & 0 deletions docker/configs/otel-collector/extension-hiveauth/config.go
@@ -0,0 +1,28 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package hiveauthextension // import "github.com/graphql-hive/console/docker/configs/otel-collector/extension-hiveauth"

import (
	"errors"
	"time"
)

type Config struct {
	// Endpoint is the address of the authentication server
	Endpoint string `mapstructure:"endpoint"`
	// Timeout is the timeout for the HTTP request to the auth service
	Timeout time.Duration `mapstructure:"timeout"`
}

func (cfg *Config) Validate() error {
	if cfg.Endpoint == "" {
		return errors.New("missing endpoint")
	}

	if cfg.Timeout <= 0 {
		return errors.New("timeout must be a positive value")
	}

	return nil
}
7 changes: 7 additions & 0 deletions docker/configs/otel-collector/extension-hiveauth/doc.go
@@ -0,0 +1,7 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

//go:generate mdatagen metadata.yaml

// Package hiveauthextension accepts HTTP requests and forwards them to an external authentication service.
package hiveauthextension // import "github.com/graphql-hive/console/docker/configs/otel-collector/extension-hiveauth"