Collect vpcflow logs from Google Cloud Platform (GCP) with Elastic Agent
What is an Elastic integration?
This integration is powered by Elastic Agent. Elastic Agent is a single, unified way to add monitoring for logs, metrics, and other types of data to a host. It can also protect hosts from security threats, query data from operating systems, forward data from remote services or hardware, and more. Refer to our documentation for a detailed comparison between Beats and Elastic Agent.
Prefer to use Beats for this use case? See Filebeat modules for logs or Metricbeat modules for metrics.
See the integrations quick start guides to get started.
The vpcflow dataset collects logs sent from and received by VM instances, including instances used as GKE nodes.
Exported fields
Field | Description | Type |
---|---|---|
@timestamp | Event timestamp. | date |
cloud.account.id | The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier. | keyword |
cloud.availability_zone | Availability zone in which this host is running. | keyword |
cloud.image.id | Image ID for the cloud instance. | keyword |
cloud.instance.id | Instance ID of the host machine. | keyword |
cloud.instance.name | Instance name of the host machine. | keyword |
cloud.machine.type | Machine type of the host machine. | keyword |
cloud.project.id | Name of the project in Google Cloud. | keyword |
cloud.provider | Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean. | keyword |
cloud.region | Region in which this host is running. | keyword |
container.id | Unique container id. | keyword |
container.image.name | Name of the image the container was built on. | keyword |
container.labels | Image labels. | object |
container.name | Container name. | keyword |
container.runtime | Runtime managing this container. | keyword |
data_stream.dataset | Data stream dataset. | constant_keyword |
data_stream.namespace | Data stream namespace. | constant_keyword |
data_stream.type | Data stream type. | constant_keyword |
destination.address | Some event destination addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the .address field. Then it should be duplicated to .ip or .domain, depending on which one it is. | keyword |
destination.as.number | Unique number allocated to the autonomous system. The autonomous system number (ASN) uniquely identifies each network on the Internet. | long |
destination.as.organization.name | Organization name. | keyword |
destination.as.organization.name.text | Multi-field of destination.as.organization.name. | match_only_text |
destination.domain | The domain name of the destination system. This value may be a host name, a fully qualified domain name, or another host naming format. The value may derive from the original event or be added from enrichment. | keyword |
destination.geo.city_name | City name. | keyword |
destination.geo.continent_name | Name of the continent. | keyword |
destination.geo.country_iso_code | Country ISO code. | keyword |
destination.geo.country_name | Country name. | keyword |
destination.geo.location | Longitude and latitude. | geo_point |
destination.geo.name | User-defined description of a location, at the level of granularity they care about. Could be the name of their data centers, the floor number, if this describes a local physical entity, city names. Not typically used in automated geolocation. | keyword |
destination.geo.region_iso_code | Region ISO code. | keyword |
destination.geo.region_name | Region name. | keyword |
destination.ip | IP address of the destination (IPv4 or IPv6). | ip |
destination.port | Port of the destination. | long |
ecs.version | ECS version this event conforms to. ecs.version is a required field and must exist in all events. When querying across multiple indices -- which may conform to slightly different ECS versions -- this field lets integrations adjust to the schema version of the events. | keyword |
event.action | The action captured by the event. This describes the information in the event. It is more specific than event.category. Examples are group-add, process-started, file-created. The value is normally defined by the implementer. | keyword |
event.category | This is one of four ECS Categorization Fields, and indicates the second level in the ECS category hierarchy. event.category represents the "big buckets" of ECS categories. For example, filtering on event.category:process yields all events relating to process activity. This field is closely related to event.type, which is used as a subcategory. This field is an array. This will allow proper categorization of some events that fall in multiple categories. | keyword |
event.created | event.created contains the date/time when the event was first read by an agent, or by your pipeline. This field is distinct from @timestamp in that @timestamp typically contains the time extracted from the original event. In most situations, these two timestamps will be slightly different. The difference can be used to calculate the delay between your source generating an event, and the time when your agent first processed it. This can be used to monitor your agent's or pipeline's ability to keep up with your event source. In case the two timestamps are identical, @timestamp should be used. | date |
event.dataset | Event dataset | constant_keyword |
event.end | event.end contains the date when the event ended or when the activity was last observed. | date |
event.id | Unique ID to describe the event. | keyword |
event.ingested | Timestamp when an event arrived in the central data store. This is different from @timestamp, which is when the event originally occurred. It's also different from event.created, which is meant to capture the first time an agent saw the event. In normal conditions, assuming no tampering, the timestamps should chronologically look like this: @timestamp < event.created < event.ingested. | date |
event.kind | This is one of four ECS Categorization Fields, and indicates the highest level in the ECS category hierarchy. event.kind gives high-level information about what type of information the event contains, without being specific to the contents of the event. For example, values of this field distinguish alert events from metric events. The value of this field can be used to inform how these kinds of events should be handled. They may warrant different retention, different access control, and it may also help understand whether the data is coming in at a regular interval or not. | keyword |
event.module | Event module | constant_keyword |
event.original | Raw text message of entire event. Used to demonstrate log integrity or where the full log message (before splitting it up in multiple parts) may be required, e.g. for reindex. This field is not indexed and doc_values are disabled. It cannot be searched, but it can be retrieved from _source. If users wish to override this and index this field, please see Field data types in the Elasticsearch Reference. | keyword |
event.outcome | This is one of four ECS Categorization Fields, and indicates the lowest level in the ECS category hierarchy. event.outcome simply denotes whether the event represents a success or a failure from the perspective of the entity that produced the event. Note that when a single transaction is described in multiple events, each event may populate different values of event.outcome, according to their perspective. Also note that in the case of a compound event (a single event that contains multiple logical events), this field should be populated with the value that best captures the overall success or failure from the perspective of the event producer. Further note that not all events will have an associated outcome. For example, this field is generally not populated for metric events, events with event.type:info, or any events for which an outcome does not make logical sense. | keyword |
event.start | event.start contains the date when the event started or when the activity was first observed. | date |
event.type | This is one of four ECS Categorization Fields, and indicates the third level in the ECS category hierarchy. event.type represents a categorization "sub-bucket" that, when used along with the event.category field values, enables filtering events down to a level appropriate for single visualization. This field is an array. This will allow proper categorization of some events that fall in multiple event types. | keyword |
gcp.destination.instance.project_id | ID of the project containing the VM. | keyword |
gcp.destination.instance.region | Region of the VM. | keyword |
gcp.destination.instance.zone | Zone of the VM. | keyword |
gcp.destination.vpc.project_id | ID of the project containing the VM. | keyword |
gcp.destination.vpc.subnetwork_name | Subnetwork on which the VM is operating. | keyword |
gcp.destination.vpc.vpc_name | VPC on which the VM is operating. | keyword |
gcp.source.instance.project_id | ID of the project containing the VM. | keyword |
gcp.source.instance.region | Region of the VM. | keyword |
gcp.source.instance.zone | Zone of the VM. | keyword |
gcp.source.vpc.project_id | ID of the project containing the VM. | keyword |
gcp.source.vpc.subnetwork_name | Subnetwork on which the VM is operating. | keyword |
gcp.source.vpc.vpc_name | VPC on which the VM is operating. | keyword |
gcp.vpcflow.reporter | The side which reported the flow. Can be either 'SRC' or 'DEST'. | keyword |
gcp.vpcflow.rtt.ms | Latency as measured (for TCP flows only) during the time interval. This is the time elapsed between sending a SEQ and receiving a corresponding ACK and it contains the network RTT as well as the application related delay. | long |
host.architecture | Operating system architecture. | keyword |
host.containerized | If the host is a container. | boolean |
host.domain | Name of the domain of which the host is a member. For example, on Windows this could be the host's Active Directory domain or NetBIOS domain name. For Linux this could be the domain of the host's LDAP provider. | keyword |
host.hostname | Hostname of the host. It normally contains what the hostname command returns on the host machine. | keyword |
host.id | Unique host id. As hostname is not always unique, use values that are meaningful in your environment. Example: The current usage of beat.name. | keyword |
host.ip | Host ip addresses. | ip |
host.mac | Host mac addresses. | keyword |
host.name | Name of the host. It can contain what hostname returns on Unix systems, the fully qualified domain name, or a name specified by the user. The sender decides which value to use. | keyword |
host.os.build | OS build information. | keyword |
host.os.codename | OS codename, if any. | keyword |
host.os.family | OS family (such as redhat, debian, freebsd, windows). | keyword |
host.os.kernel | Operating system kernel version as a raw string. | keyword |
host.os.name | Operating system name, without the version. | keyword |
host.os.name.text | Multi-field of host.os.name . | text |
host.os.platform | Operating system platform (such as centos, ubuntu, windows). | keyword |
host.os.version | Operating system version as a raw string. | keyword |
host.type | Type of host. For Cloud providers this can be the machine type like t2.medium . If vm, this could be the container, for example, or other information meaningful in your environment. | keyword |
input.type | Input type | keyword |
log.file.path | Full path to the log file this event came from, including the file name. It should include the drive letter, when appropriate. If the event wasn't read from a log file, do not populate this field. | keyword |
log.logger | The name of the logger inside an application. This is usually the name of the class which initialized the logger, or can be a custom name. | keyword |
log.offset | Log offset | long |
message | For log events the message field contains the log message, optimized for viewing in a log viewer. For structured logs without an original message field, other fields can be concatenated to form a human-readable summary of the event. If multiple messages exist, they can be combined into one message. | match_only_text |
network.bytes | Total bytes transferred in both directions. If source.bytes and destination.bytes are known, network.bytes is their sum. | long |
network.community_id | A hash of source and destination IPs and ports, as well as the protocol used in a communication. This is a tool-agnostic standard to identify flows. Learn more at https://github.com/corelight/community-id-spec. | keyword |
network.direction | Direction of the network traffic. When mapping events from a host-based monitoring context, populate this field from the host's point of view, using the values "ingress" or "egress". When mapping events from a network or perimeter-based monitoring context, populate this field from the point of view of the network perimeter, using the values "inbound", "outbound", "internal" or "external". Note that "internal" is not crossing perimeter boundaries, and is meant to describe communication between two hosts within the perimeter. Note also that "external" is meant to describe traffic between two hosts that are external to the perimeter. This could for example be useful for ISPs or VPN service providers. | keyword |
network.iana_number | IANA Protocol Number (https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml). Standardized list of protocols. This aligns well with NetFlow and sFlow related logs which use the IANA Protocol Number. | keyword |
network.name | Name given by operators to sections of their network. | keyword |
network.packets | Total packets transferred in both directions. If source.packets and destination.packets are known, network.packets is their sum. | long |
network.transport | Same as network.iana_number, but instead using the Keyword name of the transport layer (udp, tcp, ipv6-icmp, etc.) The field value must be normalized to lowercase for querying. | keyword |
network.type | In the OSI Model this would be the Network Layer. ipv4, ipv6, ipsec, pim, etc. The field value must be normalized to lowercase for querying. | keyword |
related.hash | All the hashes seen on your event. Populating this field, then using it to search for hashes can help in situations where you're unsure what the hash algorithm is (and therefore which key name to search). | keyword |
related.hosts | All hostnames or other host identifiers seen on your event. Example identifiers include FQDNs, domain names, workstation names, or aliases. | keyword |
related.ip | All of the IPs seen on your event. | ip |
related.user | All the user names or other user identifiers seen on the event. | keyword |
rule.name | The name of the rule or signature generating the event. | keyword |
source.address | Some event source addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the .address field. Then it should be duplicated to .ip or .domain, depending on which one it is. | keyword |
source.as.number | Unique number allocated to the autonomous system. The autonomous system number (ASN) uniquely identifies each network on the Internet. | long |
source.as.organization.name | Organization name. | keyword |
source.as.organization.name.text | Multi-field of source.as.organization.name. | match_only_text |
source.bytes | Bytes sent from the source to the destination. | long |
source.domain | The domain name of the source system. This value may be a host name, a fully qualified domain name, or another host naming format. The value may derive from the original event or be added from enrichment. | keyword |
source.geo.city_name | City name. | keyword |
source.geo.continent_name | Name of the continent. | keyword |
source.geo.country_iso_code | Country ISO code. | keyword |
source.geo.country_name | Country name. | keyword |
source.geo.location | Longitude and latitude. | geo_point |
source.geo.name | User-defined description of a location, at the level of granularity they care about. Could be the name of their data centers, the floor number, if this describes a local physical entity, city names. Not typically used in automated geolocation. | keyword |
source.geo.region_iso_code | Region ISO code. | keyword |
source.geo.region_name | Region name. | keyword |
source.ip | IP address of the source (IPv4 or IPv6). | ip |
source.packets | Packets sent from the source to the destination. | long |
source.port | Port of the source. | long |
tags | List of keywords used to tag each event. | keyword |
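The network.community_id field in the table above follows the open Community ID flow-hashing specification (linked in the field description). As a reference for what the value encodes, here is a minimal, IPv4-only sketch of the v1 algorithm; this is an illustration of the spec, not the integration's actual ingest-pipeline implementation:

```python
import base64
import hashlib
import socket
import struct

def community_id_v1(src_ip, src_port, dst_ip, dst_port, proto=6, seed=0):
    """Sketch of Community ID v1: SHA-1 over
    seed | ordered addresses | proto | pad | ordered ports, then base64."""
    saddr = socket.inet_aton(src_ip)  # IPv4 only in this sketch
    daddr = socket.inet_aton(dst_ip)
    # Order the endpoints so both directions of a flow hash identically.
    if (saddr, src_port) > (daddr, dst_port):
        saddr, daddr = daddr, saddr
        src_port, dst_port = dst_port, src_port
    data = (struct.pack("!H", seed) + saddr + daddr
            + struct.pack("!BBHH", proto, 0, src_port, dst_port))
    return "1:" + base64.b64encode(hashlib.sha1(data).digest()).decode("ascii")
```

Because the endpoint tuple is sorted before hashing, the request and reply sides of a connection produce the same ID, which is what makes the field useful for correlating both directions of a flow across data sources.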
An example event for vpcflow looks as follows:
{
"@timestamp": "2019-06-14T03:50:10.845Z",
"agent": {
"ephemeral_id": "f4dde373-2ff7-464b-afdb-da94763f219b",
"id": "5d3eee86-91a9-4afa-af92-c6b79bd866c0",
"name": "docker-fleet-agent",
"type": "filebeat",
"version": "8.6.0"
},
"cloud": {
"provider": "gcp"
},
"data_stream": {
"dataset": "gcp.vpcflow",
"namespace": "ep",
"type": "logs"
},
"destination": {
"address": "10.87.40.76",
"domain": "kibana",
"ip": "10.87.40.76",
"port": 5601
},
"ecs": {
"version": "8.7.0"
},
"elastic_agent": {
"id": "5d3eee86-91a9-4afa-af92-c6b79bd866c0",
"snapshot": true,
"version": "8.6.0"
},
"event": {
"agent_id_status": "verified",
"category": "network",
"created": "2023-01-13T15:03:19.118Z",
"dataset": "gcp.vpcflow",
"end": "2019-06-14T03:40:37.048196137Z",
"id": "ut8lbrffooxzf",
"ingested": "2023-01-13T15:03:20Z",
"kind": "event",
"start": "2019-06-14T03:40:36.895188084Z",
"type": "connection"
},
"gcp": {
"destination": {
"instance": {
"project_id": "my-sample-project",
"region": "us-east1",
"zone": "us-east1-b"
},
"vpc": {
"project_id": "my-sample-project",
"subnetwork_name": "default",
"vpc_name": "default"
}
},
"vpcflow": {
"reporter": "DEST",
"rtt": {
"ms": 36
}
}
},
"input": {
"type": "gcp-pubsub"
},
"log": {
"logger": "projects/my-sample-project/logs/compute.googleapis.com%2Fvpc_flows"
},
"network": {
"bytes": 1464,
"community_id": "1:++9/JiESSUdwTGGcxwXk4RA0lY8=",
"direction": "inbound",
"iana_number": "6",
"packets": 7,
"transport": "tcp",
"type": "ipv4"
},
"related": {
"ip": [
"192.168.2.117",
"10.87.40.76"
]
},
"source": {
"address": "192.168.2.117",
"as": {
"number": 15169
},
"bytes": 1464,
"geo": {
"continent_name": "America",
"country_name": "usa"
},
"ip": "192.168.2.117",
"packets": 7,
"port": 50646
},
"tags": [
"forwarded",
"gcp-vpcflow"
]
}
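The integration's ingest pipeline produces documents like the example above from raw VPC Flow Logs records. As an illustration of how a few of the ECS fields line up with the raw record, here is a hedged sketch; the jsonPayload key names (connection, bytes_sent, packets_sent, rtt_msec, reporter) come from GCP's VPC Flow Logs record format, and the mapping itself is a simplification of what the pipeline actually does:

```python
def vpcflow_to_ecs(payload: dict) -> dict:
    """Map a handful of keys from a raw VPC flow record's jsonPayload to
    ECS-style dotted field names, mirroring the example event's shape.
    Illustrative only; the integration's ingest pipeline does the real work."""
    conn = payload["connection"]
    return {
        "source.ip": conn["src_ip"],
        "source.port": conn["src_port"],
        "destination.ip": conn["dest_ip"],
        "destination.port": conn["dest_port"],
        "network.iana_number": str(conn["protocol"]),
        "source.bytes": payload["bytes_sent"],
        "source.packets": payload["packets_sent"],
        "gcp.vpcflow.reporter": payload["reporter"],
        "gcp.vpcflow.rtt.ms": payload.get("rtt_msec"),  # present for TCP flows only
    }
```

Fields such as network.community_id, geo enrichment, and related.ip are derived during ingest and are not present in the raw record, which is why they do not appear in this sketch.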
Version | Details |
---|---|
2.20.0 | Enhancement View pull request Update package to ECS 8.7.0. |
2.19.1 | Enhancement View pull request Migrate compute dashboard to lens and add datastream filter. |
2.19.0 | Enhancement View pull request Add Cloud Run metrics datastream. |
2.18.0 | Enhancement View pull request Support subscription_num_goroutines and subscription_max_outstanding_messages for GCP PubSub input |
2.17.2 | Bug fix View pull request Fix IP Convert processor in Audit ingest pipeline. |
2.17.1 | Enhancement View pull request Added categories and/or subcategories. |
2.17.0 | Enhancement View pull request Add Audit Log Overview dashboard Enhancement View pull request Add GKE Overview dashboard Enhancement View pull request Add PubSub Overview dashboard Enhancement View pull request Add Storage Overview dashboard |
2.16.2 | Bug fix View pull request Add logic to handle scalar request.policy values on audit |
2.16.1 | Bug fix View pull request Replace missing input control panel with new-style control. |
2.16.0 | Enhancement View pull request Update package to ECS 8.6.0. |
2.15.2 | Enhancement View pull request Update documentation. |
2.15.1 | Enhancement View pull request Add GCP Compute pipeline test. |
2.15.0 | Enhancement View pull request Remove support for Kibana 7.17.x Enhancement View pull request Support multiple regions for metrics data streams |
2.14.0 | Enhancement View pull request Update package to ECS 8.5.0. |
2.13.0 | Enhancement View pull request Migrate dashboard by values |
2.12.1 | Bug fix View pull request Remove duplicate fields. |
2.12.0 | Enhancement View pull request Add GCP Redis |
2.11.12 | Bug fix View pull request Add GKE ingest pipeline. |
2.11.11 | Bug fix View pull request Fix type of dns.answers.ttl. |
2.11.10 | Enhancement View pull request Add ingest pipeline for dataproc. Enhancement View pull request Add GCP loadbalancing ingest pipeline Enhancement View pull request Add GCP PubSub ingest pipeline Enhancement View pull request Add GCP Storage ingest pipeline Enhancement View pull request Add GCP Firestore ingest pipeline Enhancement View pull request Add GCP Compute ingest pipeline |
2.11.10-beta.6 | Enhancement View pull request Add ingest pipeline for dataproc. |
2.11.10-beta.5 | Enhancement View pull request Add GCP loadbalancing ingest pipeline |
2.11.10-beta.4 | Enhancement View pull request Add GCP PubSub ingest pipeline |
2.11.10-beta.3 | Enhancement View pull request Add GCP Storage ingest pipeline |
2.11.10-beta.2 | Enhancement View pull request Add GCP Firestore ingest pipeline |
2.11.10-beta.1 | Enhancement View pull request Add GCP Compute ingest pipeline |
2.11.9 | Bug fix View pull request Fix GKE kubernetes.io indentation. |
2.11.8 | Enhancement View pull request Remove duplicate fields. |
2.11.7 | Enhancement View pull request Move Dataproc lightweight module config into integration |
2.11.6 | Enhancement View pull request Move LoadBalancing lightweight module config into integration |
2.11.5 | Enhancement View pull request Move Storage lightweight module config into integration |
2.11.4 | Enhancement View pull request Move PubSub lightweight module config into integration |
2.11.3 | Enhancement View pull request Move GKE lightweight module config into integration |
2.11.2 | Enhancement View pull request Move Firestore lightweight module config into integration |
2.11.1 | Enhancement View pull request Use ECS geo.location definition. |
2.11.0 | Enhancement View pull request Move Compute lightweight module config into integration |
2.10.0 | Enhancement View pull request Add GCP PubSub Data stream |
2.9.0 | Enhancement View pull request Add GCP Dataproc Data stream |
2.8.0 | Enhancement View pull request Add GCP GKE Data Stream |
2.7.0 | Enhancement View pull request Add GCP Storage Data Stream |
2.6.0 | Enhancement View pull request Add Load Balancing logs datastream |
2.5.0 | Enhancement View pull request Add GCP Load Balancing Metricset Bug fix View pull request Fix credentials_json escaping in loadbalancing_metrics Bug fix View pull request Update loadbalancing_metrics default period to 60s Bug fix View pull request Fix event.dataset for loadbalancing_metrics Enhancement View pull request Add loadbalancing_metrics distribution fields |
2.4.0 | Enhancement View pull request Update package to ECS 8.4.0 |
2.3.0 | Enhancement View pull request Add additional parsing for DNS Public Zone Query Logs |
2.2.1 | Enhancement View pull request Fix Billing policy template title and default period for gcp.compute |
2.2.0 | Enhancement View pull request Remove fields duplicated in ECS fields |
2.1.0 | Enhancement View pull request restore compatibility with 7.17 release track |
2.0.0 | Breaking change View pull request Move configurations to support metrics. This change is breaking, as it moves some configuration from the top level variables to data stream variables. This change involves the project_id, credentials_file and credentials_json variables that are moved from input level configuration to package level configuration (as those variables are reused across all inputs/data streams). Users with GCP integration enabled will need to input values for these variables again when upgrading the policies to this version. Enhancement View pull request Add GCP Billing Data Stream Enhancement View pull request Add GCP Compute Data Stream Enhancement View pull request Add GCP Firestore Data stream |
1.10.0 | Enhancement View pull request Update package to ECS 8.3.0. |
1.9.2 | Bug fix View pull request Fix GCP auditlog parsing issue on response status |
1.9.1 | Enhancement View pull request Update readme |
1.9.0 | Enhancement View pull request Preserve request and response in flattened fields. |
1.8.0 | Enhancement View pull request Add missing cloud.provider field. |
1.7.0 | Enhancement View pull request Add dashboards for firewall and vpc flow logs. Bug fix View pull request Add missing mappings for several event.* fields. |
1.6.1 | Enhancement View pull request Clarify the GCP privileges required by the Pub/Sub input. |
1.6.0 | Enhancement View pull request Update to ECS 8.2 |
1.5.1 | Enhancement View pull request Add documentation for multi-fields |
1.5.0 | Enhancement View pull request Improve Google Cloud Platform docs. |
1.4.2 | Bug fix View pull request Remove empty values, names with only dots, and invalid client IPs. |
1.4.1 | Bug fix View pull request Fix quoting of the credentials_json value in policy templates. |
1.4.0 | Enhancement View pull request Add gcp.dns integration |
1.3.1 | Bug fix View pull request Add Ingest Pipeline script to map IANA Protocol Numbers |
1.3.0 | Enhancement View pull request Update to ECS 8.0 |
1.2.2 | Bug fix View pull request Regenerate test files using the new GeoIP database |
1.2.1 | Bug fix View pull request Change test public IPs to the supported subset |
1.2.0 | Enhancement View pull request Add 8.0.0 version constraint |
1.1.2 | Enhancement View pull request Update Title and Description. |
1.1.1 | Bug fix View pull request Fix logic that checks for the 'forwarded' tag |
1.1.0 | Enhancement View pull request Update to ECS 1.12.0 |
1.0.0 | Enhancement View pull request Move from experimental to GA Enhancement View pull request remove experimental from data_sets |
0.3.3 | Enhancement View pull request Convert to generated ECS fields |
0.3.2 | Enhancement View pull request update to ECS 1.11.0 |
0.3.1 | Enhancement View pull request Escape special characters in docs |
0.3.0 | Enhancement View pull request Update integration description |
0.2.0 | Enhancement View pull request Set "event.module" and "event.dataset" |
0.1.0 | Enhancement View pull request update to ECS 1.10.0 and adding event.original options |
0.0.2 | Enhancement View pull request update to ECS 1.9.0 |
0.0.1 | Enhancement View pull request initial release |