Pipeline components reference
This page documents the top-level configuration for pipeline components: sources, transforms, sinks, and enrichment tables.
These fields define the structure of your observability data pipeline. Each component is defined as a table within these sections, with component-specific configuration options.
For other top-level configuration options, see:
- Global Options - Global settings like data directories and timezone
- API - Configure Vector's observability API
- Schema - Configure Vector's internal schema system
- Secrets - Configure secrets management
enrichment_tables
optional object
enrichment_tables.*
required object
enrichment_tables.*.file
required object
Relevant when: type = "file"
enrichment_tables.*.file.encoding
required object
Whether or not the file contains column headers.
When set to true, the first row of the CSV file will be read as the header row, and
the values will be used for the names of each column. This is the default behavior.
When set to false, columns are referred to by their numerical index.
Default: true
| Option | Description |
|---|---|
| csv | Decodes the file as a CSV (comma-separated values) file. |
enrichment_tables.*.file.path
required string
The path of the enrichment table file.
Currently, only CSV files are supported.
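As a rough illustration of how the file table options fit together, the fragment below sketches one CSV-backed table. The table ID `users`, the file path, and the column names are hypothetical. The `include_headers` and `type` key names under `encoding` do not appear in the text above and are assumptions here; the `schema` table is documented later in this section.

```toml
# Hypothetical table ID "users"; the path is a placeholder.
[enrichment_tables.users]
type = "file"

[enrichment_tables.users.file]
path = "/etc/vector/users.csv"

[enrichment_tables.users.file.encoding]
# Assumed key names: the encoding option ("csv") and the header flag.
type = "csv"
include_headers = true

# Coerce CSV string columns into typed values (column names are made up).
[enrichment_tables.users.schema]
user_id = "integer"
signup = "timestamp|%F"
```

With headers enabled, columns would be addressed by the names in the first row; with headers disabled, by numerical index.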
enrichment_tables.*.flush_interval
optional uint
The interval used for making writes visible in the table.
Longer intervals might get better performance,
but there is a longer delay before the data is visible in the table.
Since every TTL scan makes its changes visible, only use this value
if it is shorter than the scan_interval.
By default, all writes are made visible immediately.
Relevant when: type = "memory"
enrichment_tables.*.graph
optional object
Extra graph configuration.
Configures the output for this component when generated with the graph command.
enrichment_tables.*.graph.node_attributes
optional object
Node attributes to add to this component’s node in the resulting graph.
They are added to the node as provided.
enrichment_tables.*.inputs
optional [string]
A list of upstream source or transform IDs.
Wildcards (*) are supported.
See configuration for more info.
enrichment_tables.*.internal_metrics
optional object
Relevant when: type = "memory"
enrichment_tables.*.internal_metrics.include_key_tag
optional bool
Determines whether to include the key tag on internal metrics.
This is useful for distinguishing between different keys while monitoring. However, the tag’s cardinality is unbounded.
Default: false
enrichment_tables.*.locale
optional string
The locale to use when querying the database.
MaxMind includes localized versions of some of the fields within their database, such as country name. This setting can control which of those localized versions are returned by the transform.
More information on which portions of the geolocation data are localized, and what languages are available, can be found here.
Relevant when: type = "geoip"
Default: en
enrichment_tables.*.max_byte_size
optional uint
Maximum size of the table in bytes. All insertions that make this table bigger than the maximum size are rejected.
By default, there is no size limit.
Relevant when: type = "memory"
enrichment_tables.*.path
required string
Path to the MaxMind GeoIP2 or GeoLite2 binary city database file (GeoLite2-City.mmdb).
Other databases, such as the country database, are not supported.
The mmdb enrichment table can be used for other databases.
Relevant when: type = "geoip" or type = "mmdb"
enrichment_tables.*.scan_interval
optional uint
Relevant when: type = "memory"
Default: 30
enrichment_tables.*.schema
optional object
Key/value pairs representing mapped log field names and types.
This is used to coerce log fields from strings into their proper types. The available types are listed in the Types list below.
Timestamp coercions need to be prefaced with timestamp|, for example "timestamp|%F". Timestamp specifiers can use either of the following:
- One of the built-in formats listed in the Timestamp Formats table below.
- The time format specifiers from Rust’s chrono library.
Types
- bool
- string
- float
- integer
- date
- timestamp (see the table below for formats)
Timestamp Formats
| Format | Description | Example |
|---|---|---|
| %F %T | YYYY-MM-DD HH:MM:SS | 2020-12-01 02:37:54 |
| %v %T | DD-Mmm-YYYY HH:MM:SS | 01-Dec-2020 02:37:54 |
| %FT%T | ISO 8601/RFC 3339, without time zone | 2020-12-01T02:37:54 |
| %FT%TZ | ISO 8601/RFC 3339, UTC | 2020-12-01T09:37:54Z |
| %+ | ISO 8601/RFC 3339, UTC, with time zone | 2020-12-01T02:37:54-07:00 |
| %a, %d %b %Y %T | RFC 822/RFC 2822, without time zone | Tue, 01 Dec 2020 02:37:54 |
| %a %b %e %T %Y | ctime format | Tue Dec 1 02:37:54 2020 |
| %s | UNIX timestamp | 1606790274 |
| %a %d %b %T %Y | date command, without time zone | Tue 01 Dec 02:37:54 2020 |
| %a %d %b %T %Z %Y | date command, with time zone | Tue 01 Dec 02:37:54 PST 2020 |
| %a %d %b %T %z %Y | date command, with numeric time zone | Tue 01 Dec 02:37:54 -0700 2020 |
| %a %d %b %T %#z %Y | date command, with numeric time zone (minutes can be missing or present) | Tue 01 Dec 02:37:54 -07 2020 |
Relevant when: type = "file"
enrichment_tables.*.schema.*
required string
enrichment_tables.*.source_config
optional object
Relevant when: type = "memory"
enrichment_tables.*.source_config.export_batch_size
optional uint
Batch size for data exporting. Used to prevent exporting the entire table at once and blocking the system.
By default, batches are not used and the entire table is exported.
expired output port.
Expired items ignore other settings and are exported as they are flushed from the table.
Default: false
enrichment_tables.*.source_config.export_interval
optional uint
If set to true, all data will be removed from the cache after exporting. Only valid if used as a source and export_interval > 0.
By default, export will not remove data from the cache.
Default: false
enrichment_tables.*.source_config.source_key
required string
enrichment_tables.*.ttl
optional uint
Relevant when: type = "memory"
Default: 600
enrichment_tables.*.ttl_field
optional string
Relevant when: type = "memory"
enrichment_tables.*.type
required string enum
| Option | Description |
|---|---|
| file | Exposes data from a static file as an enrichment table. |
| geoip | Exposes data from a MaxMind GeoIP2 database as an enrichment table. |
| memory | Exposes data from a memory cache as an enrichment table. The cache can be written to using a sink. |
| mmdb | Exposes data from a MaxMind database as an enrichment table. |
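For comparison with the file-backed tables, here is a minimal sketch of a memory table that combines the memory-only options from this section. The table ID `session_cache` and the upstream ID `my_transform` are hypothetical, and the `flush_interval` and `max_byte_size` values are illustrative only.

```toml
# Hypothetical memory-backed enrichment table.
[enrichment_tables.session_cache]
type = "memory"
# Upstream component IDs whose events are written into the table.
inputs = ["my_transform"]
# Defaults listed above, written out explicitly.
ttl = 600
scan_interval = 30
# Kept shorter than scan_interval, as the flush_interval notes recommend.
flush_interval = 5
# Reject insertions that would grow the table past this many bytes.
max_byte_size = 64000000
```

Leaving out `flush_interval` and `max_byte_size` would fall back to the documented behavior: writes visible immediately and no size limit.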
sinks
optional object
sinks.*
required object
sinks.*.buffer
optional object
Configures the buffering behavior for this sink.
More information about the individual buffer types, and buffer behavior, can be found in the Buffering Model section.
sinks.*.buffer.max_events
optional uint
Relevant when: type = "memory"
Default: 500
sinks.*.buffer.max_size
required uint
The maximum allowed amount of allocated memory the buffer can hold.
If type = "disk", this must be at least ~256 megabytes (268435488 bytes).
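The buffer options can be sketched as a single sub-table on a sink. The sink ID `my_sink` is hypothetical, and the sink's own `type` and other options are omitted; `type` and `when_full` are described just below.

```toml
# Buffer settings for a hypothetical sink named "my_sink".
[sinks.my_sink.buffer]
# Disk buffers trade throughput for durability.
type = "disk"
# Smallest size accepted for a disk buffer, in bytes.
max_size = 268435488
# Apply backpressure instead of dropping events when the buffer is full.
when_full = "block"
```

A memory buffer would instead set `type = "memory"` and size the buffer with `max_events`.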
sinks.*.buffer.type
optional string enum
| Option | Description |
|---|---|
| disk | Events are buffered on disk. This is less performant, but more durable. Data that has been synchronized to disk will not be lost if Vector is restarted forcefully or crashes. Data is synchronized to disk every 500ms. |
| memory | Events are buffered in memory. This is more performant, but less durable. Data will be lost if Vector is restarted forcefully or crashes. |
Default: memory
sinks.*.buffer.when_full
optional string enum
| Option | Description |
|---|---|
| block | Wait for free space in the buffer. This applies backpressure up the topology, signalling that sources should slow down the acceptance/consumption of events. This means that while no data is lost, data will pile up at the edge. |
| drop_newest | Drops the event instead of waiting for free space in the buffer. The event will be intentionally dropped. This mode is typically used when performance is the highest priority, and it is preferable to temporarily lose events rather than cause a slowdown in the acceptance/consumption of events. |
Default: block
sinks.*.graph
optional object
Extra graph configuration.
Configures the output for this component when generated with the graph command.
sinks.*.graph.node_attributes
optional object
Node attributes to add to this component’s node in the resulting graph.
They are added to the node as provided.
sinks.*.healthcheck
optional object
sinks.*.healthcheck.enabled
optional bool
Default: true
sinks.*.healthcheck.timeout
optional float
Default: 10 (seconds)
sinks.*.healthcheck.uri
optional string
The full URI to make HTTP healthcheck requests to.
This must be a valid URI, which requires at least the scheme and host. All other components – port, path, etc – are allowed as well.
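A healthcheck override might look like the fragment below. The sink ID `my_sink` and the URI are placeholders, and the rest of the sink definition is omitted.

```toml
# Healthcheck settings for a hypothetical sink named "my_sink".
[sinks.my_sink.healthcheck]
enabled = true
# Seconds to wait before the healthcheck times out (default shown).
timeout = 10.0
# Must include at least a scheme and host; port and path are optional.
uri = "https://example.com:8443/health"
```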
sinks.*.inputs
required [string]
A list of upstream source or transform IDs.
Wildcards (*) are supported.
See configuration for more info.
sinks.*.proxy
optional object
Proxy configuration.
Configure to proxy traffic through an HTTP(S) proxy when making external requests.
Similar to common proxy configuration convention, you can set different proxies to use based on the type of traffic being proxied. You can also set specific hosts that should not be proxied.
sinks.*.proxy.http
optional string
Proxy endpoint to use when proxying HTTP traffic.
Must be a valid URI string.
sinks.*.proxy.https
optional string
Proxy endpoint to use when proxying HTTPS traffic.
Must be a valid URI string.
sinks.*.proxy.no_proxy
optional [string]
A list of hosts to avoid proxying.
Multiple patterns are allowed:
| Pattern | Example match |
|---|---|
| Domain names | example.com matches requests to example.com |
| Wildcard domains | .example.com matches requests to example.com and its subdomains |
| IP addresses | 127.0.0.1 matches requests to 127.0.0.1 |
| CIDR blocks | 192.168.0.0/16 matches requests to any IP addresses in this range |
| Splat | * matches all hosts |
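The proxy options and the pattern types in the table above can be combined roughly as follows. The sink ID `my_sink`, the proxy endpoint, and the host names are all placeholders.

```toml
# Proxy settings for a hypothetical sink named "my_sink".
[sinks.my_sink.proxy]
# Separate endpoints for HTTP and HTTPS traffic; both must be valid URIs.
http = "http://proxy.example.com:3128"
https = "http://proxy.example.com:3128"
# Hosts that bypass the proxy: a domain name, an IP address,
# a wildcard domain, and a CIDR block.
no_proxy = ["example.org", "127.0.0.1", ".internal.example.com", "192.168.0.0/16"]
```

Setting `no_proxy = ["*"]` would, per the table, match every host.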
sources
optional object
sources.*
required object
sources.*.graph
optional object
Extra graph configuration.
Configures the output for this component when generated with the graph command.
sources.*.graph.node_attributes
optional object
Node attributes to add to this component’s node in the resulting graph.
They are added to the node as provided.
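Since the attributes are passed through as provided, a fragment like the one below would attach two attributes to one source's node. The source ID `my_source` is hypothetical, and the attribute names `color` and `shape` are Graphviz-style names chosen for illustration, on the assumption that the graph is rendered in a format that understands them.

```toml
# Graph styling for a hypothetical source named "my_source".
[sources.my_source.graph.node_attributes]
# Passed to the node unchanged; names and values are illustrative.
color = "red"
shape = "box"
```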
sources.*.proxy
optional object
Proxy configuration.
Configure to proxy traffic through an HTTP(S) proxy when making external requests.
Similar to common proxy configuration convention, you can set different proxies to use based on the type of traffic being proxied. You can also set specific hosts that should not be proxied.
sources.*.proxy.http
optional string
Proxy endpoint to use when proxying HTTP traffic.
Must be a valid URI string.
sources.*.proxy.https
optional string
Proxy endpoint to use when proxying HTTPS traffic.
Must be a valid URI string.
sources.*.proxy.no_proxy
optional [string]
A list of hosts to avoid proxying.
Multiple patterns are allowed:
| Pattern | Example match |
|---|---|
| Domain names | example.com matches requests to example.com |
| Wildcard domains | .example.com matches requests to example.com and its subdomains |
| IP addresses | 127.0.0.1 matches requests to 127.0.0.1 |
| CIDR blocks | 192.168.0.0/16 matches requests to any IP addresses in this range |
| Splat | * matches all hosts |
transforms
optional object
transforms.*
required object
transforms.*.graph
optional object
Extra graph configuration.
Configures the output for this component when generated with the graph command.
transforms.*.graph.node_attributes
optional object
Node attributes to add to this component’s node in the resulting graph.
They are added to the node as provided.
transforms.*.inputs
required [string]
A list of upstream source or transform IDs.
Wildcards (*) are supported.
See configuration for more info.
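To show how `inputs` wires components together, the sketch below feeds one transform from a wildcard pattern and one explicit ID. The component IDs are hypothetical, and the `remap` transform type with its `source` option is not described on this page and is used here only as an assumed example.

```toml
# Hypothetical transform that reads from several upstream components.
[transforms.tag_events]
# Assumed transform type; its options are documented elsewhere.
type = "remap"
# "app_*" is meant to select every upstream ID starting with "app_";
# "syslog_in" is listed explicitly.
inputs = ["app_*", "syslog_in"]
source = '.pipeline = "demo"'
```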