    Google Cloud Storage
    • 06 Jun 2025


    Overview

    This Adapter allows you to ingest files/blobs stored in Google Cloud Storage (GCS).

    Note that this adapter operates as a sink by default, meaning it will "consume" files from the GCS bucket by deleting them once ingested.
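The sink behavior can be sketched roughly as a list-ingest-delete loop. The following is a minimal illustration only, using a `FakeBucket` stand-in for a real GCS client; the actual adapter also handles polling, retries, parallel fetching, and event mapping:

```python
# Minimal sketch of the adapter's default "sink" behavior: each object
# matching the prefix is listed, ingested, then deleted from the bucket.
# FakeBucket is a stand-in for a real GCS client, for illustration only.

class FakeBucket:
    def __init__(self, blobs):
        self.blobs = dict(blobs)  # object name -> bytes

    def list_blobs(self, prefix=""):
        return [n for n in self.blobs if n.startswith(prefix)]

    def download(self, name):
        return self.blobs[name]

    def delete(self, name):
        del self.blobs[name]

def consume(bucket, prefix="", single_load=False):
    """Ingest (and delete) all matching objects; one pass if single_load."""
    ingested = []
    while True:
        names = bucket.list_blobs(prefix=prefix)
        for name in names:
            ingested.append(bucket.download(name))  # forward to LimaCharlie
            bucket.delete(name)                     # "consume" the object
        if single_load or not names:
            break
    return ingested

bucket = FakeBucket({"security_logs/a.json": b"{}", "other/b.json": b"{}"})
events = consume(bucket, prefix="security_logs/")
# objects under security_logs/ are ingested and removed; others are untouched
```

With `single_load` set, the loop makes exactly one pass and exits instead of continuing to watch the bucket.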

    Configurations

    Adapter Type: gcs

    • client_options: common configuration for the adapter, as defined in the general adapter documentation.

    • bucket_name: the name of the bucket to ingest from.

    • service_account_creds: the string version of the JSON credentials for a (Google) Service Account to use accessing the bucket.

    • prefix: only ingest files with a given path prefix.

    • single_load: if true, the adapter will not operate as a sink; it will ingest all files in the bucket once and then exit.
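As the IaC example below notes, service_account_creds may be supplied either as the JSON credentials document itself or as a path to a credentials file. A hypothetical helper illustrating how the two forms could be distinguished (`load_creds` is not part of the real adapter; this is only a sketch of the dual-form parameter):

```python
import json

def load_creds(value):
    """Return parsed service-account credentials.

    Accepts either the JSON document itself or a filesystem path to it,
    mirroring the two ways service_account_creds can be supplied.
    Illustrative only; load_creds is not a real adapter function.
    """
    value = value.strip()
    if value.startswith("{"):       # inline JSON string
        return json.loads(value)
    with open(value) as f:          # otherwise treat it as a file path
        return json.load(f)

inline = '{"type": "service_account", "project_id": "demo"}'
creds = load_creds(inline)
```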

    Infrastructure as Code Deployment

    # Google Cloud Storage (GCS) Specific Docs: https://docs.limacharlie.io/docs/adapter-types-gcs
    
    gcs:
      bucket_name: "your-gcs-bucket-for-limacharlie-logs" # (required) The name of the GCS bucket to ingest from.
      service_account_creds: "/opt/limacharlie_adapter/gcp_creds/gcs_read_service_account.json" # (required) Path to the JSON credentials file for a GCP Service Account, or the JSON string itself. Needs storage.objectViewer permission.
      single_load: false # (optional) If true, ingest all files once then exit. Default is false (continuous, processes new files).
      prefix: "security_logs/firewall/" # (optional) Only ingest files with a given path prefix. Do not include a leading /.
      parallel_fetch: 5 # (optional) Number of files to fetch in parallel. Default is 1.
      client_options:
        identity:
          oid: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" # (required) Organization ID from LimaCharlie.
          installation_key: "YOUR_LC_INSTALLATION_KEY_GCS" # (required) Installation key associated with the OID.
        hostname: "gcs-log-adapter-prod-01.example.com" # (required if not using sensor_hostname_path)
        platform: "gcp_gcs" # (required) Indicates the source is Google Cloud Storage.
        architecture: null # (optional) Not typically applicable for this type of adapter.
        mapping:
          # GCS often contains structured logs (JSON lines, CSV) or gzipped text.
          # parsing_re might be used if files are line-delimited text needing regex.
          # For JSON lines, parsing_re would be null.
          parsing_re: null # Example for JSON lines: null. For specific text lines: "^(?P<timestamp>\\S+) (?P<source_ip>\\S+) (?P<message>.*)$"
          # (optional) Using GCS object name or a field within the log data if objects contain multiple events.
          sensor_key_path: "routing.original_file_path" # Example: using the original file path as part of the sensor key.
          # (optional) If client_options.hostname is NOT set, use this to dynamically extract hostname from event data (less common for GCS bucket logs).
          sensor_hostname_path: null
          # (optional) Example based on GCS Audit Log structure or custom log structure.
          event_type_path: "GCS_LOG_{{ .payload.protoPayload.serviceName | token | upper | default \"STORAGE\" }}_{{ .payload.protoPayload.methodName | token | upper | default \"GENERIC\" }}"
          # (optional) JSON path to the event's occurrence time. For GCS Audit Logs, often within protoPayload. For other logs, depends on their format.
          event_time_path: "timestamp" # Or "receiveTimestamp" for GCS Audit Logs, or a field within the custom log content.
          # (optional) JSON path for a field to populate LimaCharlie's investigation_id. For GCS Audit Logs, 'insertId' is often unique.
          investigation_id_path: "payload.insertId"
          # (optional) Use +/- syntax for transforms.
          transform:
            "+cloud_storage_provider": "GoogleCloudStorage"
            "+gcs_bucket_configured": "{{ .config.bucket_name }}" # Accessing adapter config for bucket name
            "+gcs_object_processed": "{{ .routing.original_file_path }}" # If adapter provides this in routing info
            # If logs are GCS Audit Logs:
            "+actor_principal_email": "{{ .payload.protoPayload.authenticationInfo.principalEmail }}"
            "-payload.metadata": null # If GCS Audit Log metadata is too verbose
          # (optional) A list of field paths to drop.
          drop_fields:
      - "payload.protoPayload.requestMetadata.callerSuppliedUserAgent" # Example for GCS Audit Logs
          - "internal_processing_markers"
          sid_replication_path: null # (optional)
        # mappings: null
        indexing:
          enabled: true
          default_index: "gcs-logs-{{ .identity.oid | substr 0 8 }}"
        is_compressed: true # (optional) Often true for logs stored in GCS (e.g., .gz, .zip). Adapter usually handles decompression.
        sensor_seed_key: "SEED_KEY_GCS_ADAPTER_001" # (required)
        dest_url: "https://input.limacharlie.io" # (optional) The destination URL. Usually defaults correctly.
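The transform section in the example above uses a +/- prefix syntax: keys starting with "+" add a field to the event, and keys starting with "-" drop one. A rough Python illustration of those semantics (template rendering with {{ ... }} is omitted, and `apply_transform` is a hypothetical stand-in for the adapter's real implementation):

```python
def apply_transform(event, transform):
    """Apply +/- transform rules to a (possibly nested) event dict.

    "+path.to.field": value  adds or overwrites the field
    "-path.to.field": None   removes the field if present
    Illustrative only; {{ ... }} template rendering is not handled here.
    """
    out = dict(event)  # shallow copy for illustration
    for key, value in transform.items():
        op, path = key[0], key[1:].split(".")
        node = out
        for part in path[:-1]:
            node = node.setdefault(part, {})
        if op == "+":
            node[path[-1]] = value
        elif op == "-":
            node.pop(path[-1], None)
    return out

event = {"payload": {"metadata": {"big": "blob"}, "insertId": "abc"}}
rules = {
    "+cloud_storage_provider": "GoogleCloudStorage",
    "-payload.metadata": None,
}
result = apply_transform(event, rules)
# result gains cloud_storage_provider and loses payload.metadata
```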

    API Doc

    See the official documentation.

