[BUG] VPC Flow Log Integration Issue with Glue Catalog Database Name #1041

Open
rafael-gumiero opened this issue Feb 10, 2025 · 1 comment
Labels
enhancement New feature or request

Comments

@rafael-gumiero

VPC Flow Log Integration Issue with Glue Catalog Database Name

Environment

  • OpenSearch Version: 2.17
  • Integration Type: VPC Flow Log
  • External Source: Amazon S3
  • AWS Services: AWS Glue Data Catalog
  • Catalog Name: test_zero_etl_aos
  • Database Name: db_test_zero_etl_aos
  • Table Name: table_test_zero_etl_aos_flow

Description

When attempting to configure VPC Flow Log integration with OpenSearch using AWS Glue Data Catalog as the metadata store, the system fails to properly handle the database name configuration. The integration expects the default database name in the Glue Catalog, but fails even when this is explicitly specified.

Current Behavior

  • Integration setup fails when attempting to configure VPC Flow Log integration
  • Error occurs even when explicitly specifying the expected default database name
  • Integration does not properly recognize the Glue Catalog database configuration

Steps to Reproduce

  1. Configure Amazon S3 as an external source in OpenSearch 2.17
  2. Attempt to set up the VPC Flow Log integration
  3. Specify a different database name in the Glue Catalog configuration
  4. Execute the integration setup

Error Messages

Image

Image

Table Sample

```sql
CREATE TABLE test_zero_etl_aos.db_test_zero_etl_aos.table_test_zero_etl_aos_flow (
  version INT,
  account_id STRING,
  interface_id STRING,
  srcaddr STRING,
  dstaddr STRING,
  srcport INT,
  dstport INT,
  protocol INT,
  packets BIGINT,
  bytes BIGINT,
  start BIGINT,
  end BIGINT,
  action STRING,
  log_status STRING,
  `aws-account-id` STRING,
  `aws-service` STRING,
  `aws-region` STRING,
  year STRING,
  month STRING,
  day STRING,
  hour STRING
)
USING parquet
PARTITIONED BY (`aws-account-id`, `aws-service`, `aws-region`, year, month, day, hour)
LOCATION "s3://xxxxx/flow/AWSLogs/"
```
@rafael-gumiero added the bug (Something isn't working) and untriaged labels on Feb 10, 2025
@noCharger
Collaborator

@rafael-gumiero I would call this a feature gap rather than a bug. Based on the source code, the integration setup only takes the table name, not the database name.
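
To make the gap concrete, here is a minimal Python sketch of the behavior described above. It is not taken from the project's source: the function name `qualified_table_name`, the `database` parameter, and the fallback value `"default"` are all assumptions made for illustration. It only shows how a setup step that accepts a table name but no database name ends up pointing at a different three-part name than the one the table was created under.

```python
from typing import Optional

# Assumption: the database used when the caller supplies none.
DEFAULT_DATABASE = "default"


def qualified_table_name(
    catalog: str, table: str, database: Optional[str] = None
) -> str:
    """Build a catalog.database.table name.

    Hypothetical helper: when no database is given, fall back to
    DEFAULT_DATABASE, mirroring a setup flow that only takes a table name.
    """
    return ".".join([catalog, database or DEFAULT_DATABASE, table])


# Only the table name is passed, so the database falls back to the default.
without_db = qualified_table_name(
    "test_zero_etl_aos", "table_test_zero_etl_aos_flow"
)

# The database from the issue is passed explicitly.
with_db = qualified_table_name(
    "test_zero_etl_aos",
    "table_test_zero_etl_aos_flow",
    database="db_test_zero_etl_aos",
)

print(without_db)
print(with_db)
```

Under these assumptions, the first name resolves to a `default` database while the table in the sample lives in `db_test_zero_etl_aos`, so a setup step built on the first form would not find it. Accepting an optional database name, as in the second call, is one way the enhancement could close the gap.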

@noCharger added the enhancement (New feature or request) label and removed the bug (Something isn't working) label on Feb 13, 2025