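Key Logs for Operations
While running, GreptimeDB writes key operations and unexpected errors to its logs. You can use these logs to understand how GreptimeDB is behaving and to find the cause of errors.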
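Log Location
By default, every GreptimeDB component writes logs at INFO level to the following locations:
- Standard output
- The greptimedb_data/logs directory under GreptimeDB's current working directory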
The log output directory can also be changed through the [logging] section of the configuration file or through the --log_dir startup option:
[logging]
dir = "/path/to/logs"
Log files are named as follows:
- greptimedb.YYYY-MM-DD-HH: contains logs at INFO level and above
- greptimedb-err.YYYY-MM-DD-HH: contains error logs
For example:
greptimedb.2025-04-11-06
greptimedb-err.2025-04-11-06
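The naming scheme above can be checked mechanically when collecting log files. The following is a minimal sketch, assuming only the two name shapes shown above; the function name parse_log_name is hypothetical and not part of GreptimeDB, and the sketch makes no assumption about the time zone of the timestamp.

```python
import re
from datetime import datetime

# Matches the two file-name shapes described above:
#   greptimedb.YYYY-MM-DD-HH and greptimedb-err.YYYY-MM-DD-HH
LOG_NAME = re.compile(r"^(greptimedb(?:-err)?)\.(\d{4}-\d{2}-\d{2}-\d{2})$")


def parse_log_name(name):
    """Return (is_error_log, hour_start) for a log file name.

    Returns None when the name does not follow the scheme.
    parse_log_name is a hypothetical helper, not a GreptimeDB API.
    """
    match = LOG_NAME.match(name)
    if match is None:
        return None
    try:
        hour_start = datetime.strptime(match.group(2), "%Y-%m-%d-%H")
    except ValueError:
        # Digits are present but do not form a valid date or hour.
        return None
    return (match.group(1) == "greptimedb-err", hour_start)
```

A helper like this can be used to pick only the greptimedb-err files for a given time window before inspecting them.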
GreptimeDB currently consists of the following components:
- frontend
- datanode
- metasrv
- flownode
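If you need to adjust the log level, for example to view the DEBUG logs of a component, refer to this document to change it at runtime.
Important Logs
The logs listed below are the ones we recommend monitoring.
Error Logs
When the database is running normally and stably, it does not produce error logs. If a database operation fails or a panic occurs, an error log is printed. We recommend checking the error logs of all components.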
Panic
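If the database panics, we recommend collecting the panic log and reporting it to the GreptimeDB team. A panic log typically looks like the following; the keyword to search for is panicked at: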
2025-04-02T14:44:24.485935Z ERROR common_telemetry::panic_hook: panicked at /greptime/.cargo/git/checkouts/datafusion-11a8b534adb6bd68-shallow/2464703/datafusion/expr/src/logical_plan/plan.rs:1035:25:
with_new_exprs for Distinct does not support sort expressions backtrace= 0: backtrace::backtrace::libunwind::trace
at /greptime/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/backtrace-0.3.74/src/backtrace/libunwind.rs:116:5
backtrace::backtrace::trace_unsynchronized
at /greptime/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/backtrace-0.3.74/src/backtrace/mod.rs:66:5
1: backtrace::backtrace::trace
at /greptime/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/backtrace-0.3.74/src/backtrace/mod.rs:53:14
2: backtrace::capture::Backtrace::create
at /greptime/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/backtrace-0.3.74/src/capture.rs:292:9
3: backtrace::capture::Backtrace::new
at /greptime/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/backtrace-0.3.74/src/capture.rs:257:22
4: common_telemetry::panic_hook::set_panic_hook::{{closure}}
at /greptime/codes/greptime/procedure-traits/src/common/telemetry/src/panic_hook.rs:37:25
5: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
at /rustc/409998c4e8cae45344fd434b358b697cc93870d0/library/alloc/src/boxed.rs:1984:9
std::panicking::rust_panic_with_hook
at /rustc/409998c4e8cae45344fd434b358b697cc93870d0/library/std/src/panicking.rs:820:13
6: std::panicking::begin_panic_handler::{{closure}}
at /rustc/409998c4e8cae45344fd434b358b697cc93870d0/library/std/src/panicking.rs:678:13
7: std::sys::backtrace::__rust_end_short_backtrace
at /rustc/409998c4e8cae45344fd434b358b697cc93870d0/library/std/src/sys/backtrace.rs:168:18
8: rust_begin_unwind
at /rustc/409998c4e8cae45344fd434b358b697cc93870d0/library/std/src/panicking.rs:676:5
9: core::panicking::panic_fmt
at /rustc/409998c4e8cae45344fd434b358b697cc93870d0/library/core/src/panicking.rs:75:14
10: datafusion_expr::logical_plan::plan::LogicalPlan::with_new_exprs
at /greptime/.cargo/git/checkouts/datafusion-11a8b534adb6bd68-shallow/2464703/datafusion/expr/src/logical_plan/plan.rs:1035:25
11: <query::optimizer::type_conversion::TypeConversionRule as query::optimizer::ExtensionAnalyzerRule>::analyze::{{closure}}
at /greptime/codes/greptime/procedure-traits/src/query/src/optimizer/type_conversion.rs:105:17
12: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &F>::call_mut
at /greptime/.rustup/toolchains/nightly-2024-12-25-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:272:13
13: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &mut F>::call_once
at /greptime/.rustup/toolchains/nightly-2024-12-25-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:305:13
14: datafusion_common::tree_node::Transformed<T>::transform_parent
at /greptime/.cargo/git/checkouts/datafusion-11a8b534adb6bd68-shallow/2464703/datafusion/common/src/tree_node.rs:764:44
15: datafusion_common::tree_node::TreeNode::transform_up::transform_up_impl::{{closure}}
at /greptime/.cargo/git/checkouts/datafusion-11a8b534adb6bd68-shallow/2464703/datafusion/common/src/tree_node.rs:265:13
16: stacker::maybe_grow
at /greptime/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/stacker-0.1.17/src/lib.rs:55:9
datafusion_common::tree_node::TreeNode::transform_up::transform_up_impl
at /greptime/.cargo/git/checkouts/datafusion-11a8b534adb6bd68-shallow/2464703/datafusion/common/src/tree_node.rs:260:9
17: datafusion_common::tree_node::TreeNode::transform_up
at /greptime/.cargo/git/checkouts/datafusion-11a8b534adb6bd68-shallow/2464703/datafusion/common/src/tree_node.rs:269:9
18: datafusion_common::tree_node::TreeNode::transform
at /greptime/.cargo/git/checkouts/datafusion-11a8b534adb6bd68-shallow/2464703/datafusion/common/src/tree_node.rs:220:9
19: <query::optimizer::type_conversion::TypeConversionRule as query::optimizer::ExtensionAnalyzerRule>::analyze
at /greptime/codes/greptime/procedure-traits/src/query/src/optimizer/type_conversion.rs:46:9
20: query::query_engine::state::QueryEngineState::optimize_by_extension_rules::{{closure}}
at /greptime/codes/greptime/procedure-traits/src/query/src/query_engine/state.rs:195:17
21: core::iter::traits::iterator::Iterator::try_fold
at /greptime/.rustup/toolchains/nightly-2024-12-25-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/iter/traits/iterator.rs:2370:21
22: query::query_engine::state::QueryEngineState::optimize_by_extension_rules
at /greptime/codes/greptime/procedure-traits/src/query/src/query_engine/state.rs:192:9
23: query::planner::DfLogicalPlanner::plan_sql::{{closure}}::{{closure}}
at /greptime/codes/greptime/procedure-traits/src/query/src/planner.rs:119:20
24: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
at /greptime/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tracing-0.1.40/src/instrument.rs:321:9
25: query::planner::DfLogicalPlanner::plan_sql::{{closure}}
at /greptime/codes/greptime/procedure-traits/src/query/src/planner.rs:71:5
26: <query::planner::DfLogicalPlanner as query::planner::LogicalPlanner>::plan::{{closure}}::{{closure}}
at /greptime/codes/greptime/procedure-traits/src/query/src/planner.rs:198:73
27: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
at /greptime/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tracing-0.1.40/src/instrument.rs:321:9
28: <query::planner::DfLogicalPlanner as query::planner::LogicalPlanner>::plan::{{closure}}
at /greptime/codes/greptime/procedure-traits/src/query/src/planner.rs:195:5
...
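Searching for the panicked at keyword can be automated. The following is a minimal sketch, assuming the log is read line by line as text; find_panics is a hypothetical helper and the file path in the usage comment is only an example.

```python
def find_panics(lines):
    """Yield (line_number, line) for every line containing the panic keyword.

    find_panics is a hypothetical helper, not a GreptimeDB API.
    """
    for number, line in enumerate(lines, start=1):
        if "panicked at" in line:
            yield number, line.rstrip("\n")


# Example usage against an error log file (the path is an assumption):
# with open("greptimedb_data/logs/greptimedb-err.2025-04-11-06",
#           encoding="utf-8", errors="replace") as log_file:
#     for number, line in find_panics(log_file):
#         print(number, line)
```

The backtrace lines that follow each match are worth including when reporting the panic.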
Metasrv
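When nodes join or leave a GreptimeDB cluster, when regions are migrated, or when schemas change, metasrv records corresponding logs. In addition to the error logs of each component, we therefore recommend monitoring the following metasrv log keywords.
Metasrv leader switch / election start
// ERROR level. Indicates that the current leader has stepped down and a new election will follow. Note that the {:?} part is the leader identifier.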
"Leader :{:?} step down"