Add support for catalog filtering in DBX adapter #941

Open
1 of 5 tasks
Dynosphere opened this issue Aug 21, 2024 · 0 comments
Labels
bug Something isn't working triage

Describe the bug

The get_tables_by_pattern_sql() macro does not properly support Databricks.
Databricks exposes table_catalog as a column in information_schema.tables, but the macro does not filter on the provided database in its where clause. If you pass 'system' as the database, it returns tables from all catalogs that match the schema and table patterns.

Steps to reproduce

Run any macro that calls get_tables_by_pattern_sql() against Databricks with 'system' as the database and 'billing' as the schema. If a schema named billing also exists in another catalog, the macro pulls through the matching tables from all catalogs.
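A minimal reproduction sketch, assuming the macro is invoked through the dbt_utils namespace (the same namespace the proposed fix uses for get_table_types_sql()). The file name is hypothetical, and '%' is used as the table pattern only to match every table name:

```sql
-- analyses/repro_catalog_filter.sql (hypothetical file name)
-- Compile with `dbt compile` and inspect the rendered query's where clause.
{{ dbt_utils.get_tables_by_pattern_sql(
    schema_pattern='billing',
    table_pattern='%',
    database='system'
) }}
```

If the rendered SQL has no condition on table_catalog, running it returns rows from every catalog that contains a billing schema.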

Expected results

Respect the database filter and return only tables within the specified catalog.

Actual results

See steps to reproduce. The macro currently returns tables across all catalogs when schemas with the same name exist in more than one catalog.
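To check whether a workspace is affected, a diagnostic query along these lines can be run directly in Databricks. It is a sketch that relies on the system catalog's information_schema listing objects from every catalog in the metastore, and on the 'billing' schema name from the steps above:

```sql
-- Count the tables each catalog contributes for schemas named 'billing'.
-- More than one table_catalog in the result means the unfiltered macro
-- mixes tables from several catalogs.
select
    table_catalog,
    table_schema,
    count(*) as table_count
from system.information_schema.tables
where table_schema ilike 'billing'
group by table_catalog, table_schema
order by table_catalog;
```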

Screenshots and log output

System information

packages:
  - package: dbt-labs/codegen
    version: 0.12.1

Which database are you using dbt with?

  • [ ] postgres
  • [ ] redshift
  • [ ] bigquery
  • [ ] snowflake
  • [x] other (specify: Databricks)

The output of dbt --version:

Core:
  - installed: 1.8.5
  - latest:    1.8.5 - Up to date!

Plugins:
  - databricks: 1.8.5 - Up to date!
  - spark:      1.8.0 - Up to date!

Additional context

Are you interested in contributing the fix?

Sure. The fix is to add a Databricks-specific version of the macro to the get_tables_by_pattern_sql file that filters on table_catalog:

{% macro databricks__get_tables_by_pattern_sql(schema_pattern, table_pattern, exclude='', database=target.database) %}

        select distinct
            table_schema as {{ adapter.quote('table_schema') }},
            table_name as {{ adapter.quote('table_name') }},
            {{ dbt_utils.get_table_types_sql() }}
        from {{ database }}.information_schema.tables
        where table_catalog ilike '{{ database }}'
        and table_schema ilike '{{ schema_pattern }}'
        and table_name ilike '{{ table_pattern }}'
        and table_name not ilike '{{ exclude }}'

{% endmacro %}
Dynosphere added the bug (Something isn't working) and triage labels on Aug 21, 2024