| 1 | +--- |
| 2 | +title: Backend Table Hierarchy |
| 3 | +--- |
| 4 | + |
| 5 | +Most SQL backends organize tables into groups, and some use a two-level hierarchy. |
| 6 | +They use terms such as `catalog`, `database`, and/or `schema` to refer to these groups. |
| 7 | + |
| 8 | +Ibis uses the following terminology throughout its codebase, API, and documentation: |
| 9 | + |
| 10 | +- `database`: a collection of tables |
| 11 | +- `catalog`: a collection of databases |
| 12 | + |
| 13 | +In other words, the full specification of a table in Ibis is one of: |
| 14 | +- `catalog.database.table` |
| 15 | +- `database.table` |
| 16 | + |
| 17 | +For example, to access the `t` table in the `d` database in the `c` catalog, |
| 18 | +you would call `conn.table("t", database=("c", "d"))`. |
| 19 | +See the [Backend.table() documentation](backends/duckdb.qmd#ibis.backends.duckdb.Backend.table) |
| 20 | +for more details. |
| 21 | + |
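As a minimal sketch of how the two forms above map onto the `database` argument, the helper below splits a dotted name into a table name plus the value you would pass as `database=`. `split_table_spec` is a hypothetical helper written for illustration and is not part of Ibis. It also assumes that a single-level database can be passed as a plain string, and it does not handle quoted identifiers that contain dots.

```python
from typing import Optional, Tuple, Union

# The value passed as `database=`: a (catalog, database) tuple,
# a single database name (assumed to be accepted as a string), or None.
DatabaseArg = Optional[Union[str, Tuple[str, str]]]


def split_table_spec(spec: str) -> Tuple[str, DatabaseArg]:
    """Split `[catalog.][database.]table` into (table, database argument).

    Hypothetical helper for illustration only; not an Ibis API.
    """
    parts = spec.split(".")
    if len(parts) > 3 or any(not part for part in parts):
        raise ValueError(f"expected [catalog.][database.]table, got {spec!r}")
    if len(parts) == 3:
        catalog, database, table = parts
        return table, (catalog, database)
    if len(parts) == 2:
        database, table = parts
        return table, database
    return parts[0], None


print(split_table_spec("c.d.t"))
print(split_table_spec("d.t"))
```

With a real connection, the result could then be forwarded as `conn.table(table, database=database_arg)`.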
| 22 | +We use this common terminology in the API of every Backend, **once constructed**. |
| 23 | +However, when you initially **create** a Backend, you will use the |
| 24 | +backend-specific terminology. We made this design decision so that |
| 25 | + |
| 26 | +- You can use the same API for constructing a Backend as for constructing the Backend's |
| 27 | + native connection. |
| 28 | +- Once you have the Backend, you can use the common Ibis terminology |
| 29 | + no matter which Backend you are using. |
| 30 | + If you want to switch from one Backend to another, |
| 31 | + you only have to change the code that **creates** the connection, |
| 32 | + not the code that **uses** the connection. |
| 33 | + |
| 34 | +For example, when connecting to a PostgreSQL database using the native |
| 35 | +`psycopg` driver, you would use the following code: |
| 36 | + |
| 37 | +```python |
| 38 | +psycopg.connect( |
| 39 | +    user="me", |
| 40 | +    password="supersecret", |
| 41 | +    host="abc.com", |
| 42 | +    port=5432, |
| 43 | +    dbname="my_database", |
| 44 | +    options="-csearch_path=my_schema", |
| 45 | +) |
| 46 | +``` |
| 47 | + |
| 48 | +In Ibis, you would use the following code (note how it is analogous to the above): |
| 49 | + |
| 50 | +```python |
| 51 | +conn = ibis.postgres.connect( |
| 52 | +    user="me", |
| 53 | +    password="supersecret", |
| 54 | +    host="abc.com", |
| 55 | +    port=5432, |
| 56 | +    database="my_database", |
| 57 | +    schema="my_schema", |
| 58 | +) |
| 59 | +``` |
| 60 | + |
| 61 | +**After** you have constructed the Backend, however, you use the common terminology: |
| 62 | + |
| 63 | +```python |
| 64 | +conn.table("my_table", database=("my_database", "my_schema")) |
| 65 | +conn.list_catalogs() # results in something like ["my_database"] |
| 66 | +conn.list_databases() # results in ["my_schema"] |
| 67 | +``` |
| 68 | + |
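To make the benefit of this split concrete, here is a minimal sketch of code that **uses** a connection without knowing which Backend created it. `FakeBackend` and `load_orders` are hypothetical names introduced for illustration. The stub only mimics the `table(name, database=(catalog, database))` call shown above, so the example runs without any database.

```python
class FakeBackend:
    """Stand-in for an already constructed Backend (hypothetical, for illustration)."""

    def __init__(self):
        self.calls = []

    def table(self, name, database=None):
        # Record the request instead of querying a real database.
        self.calls.append((name, database))
        return f"<table {name}>"


def load_orders(conn, catalog, database):
    # Only the common terms `catalog` and `database` appear here,
    # so this function would not change when the Backend changes.
    return conn.table("orders", database=(catalog, database))


conn = FakeBackend()
result = load_orders(conn, "my_database", "my_schema")
print(conn.calls)
```

Under this assumption, switching Backends would only touch the line that constructs `conn`, while `load_orders` stays as written.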
| 69 | +Below is a table with the terminology used by each backend for the two levels of |
| 70 | +hierarchy. This is provided as a reference. Note that when using Ibis, we |
| 71 | +use the terms `catalog` and `database` and map them onto the appropriate fields. |
| 72 | + |
| 73 | + |
| 74 | +| Backend | Catalog | Database | |
| 75 | +|------------|----------------|------------| |
| 76 | +| bigquery | project | dataset | |
| 77 | +| clickhouse | | database | |
| 78 | +| datafusion | catalog | schema | |
| 79 | +| druid | dataSourceType | dataSource | |
| 80 | +| duckdb | database | schema | |
| 81 | +| flink | catalog | database | |
| 82 | +| impala | | database | |
| 83 | +| mssql | database | schema | |
| 84 | +| mysql | | database | |
| 85 | +| oracle | | database | |
| 86 | +| pandas | | NA | |
| 87 | +| polars | | NA | |
| 88 | +| postgres | database | schema | |
| 89 | +| pyspark | database | schema | |
| 90 | +| risingwave | database | schema | |
| 91 | +| sqlite | | schema | |
| 92 | +| snowflake | database | schema | |
| 93 | +| trino | catalog | schema | |
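A few rows of the table above can be written as a small lookup, which may help when translating between a backend's native terms and the Ibis terms. This is a sketch only: `NATIVE_TERMS` and `describe` are hypothetical names, not part of Ibis, and the dictionary copies just a subset of the rows, using `None` where the table leaves the catalog column blank.

```python
# backend -> (native term for the Ibis `catalog`, native term for the Ibis `database`)
# Values copied from a subset of the reference table; None means no catalog level.
NATIVE_TERMS = {
    "duckdb": ("database", "schema"),
    "flink": ("catalog", "database"),
    "mysql": (None, "database"),
    "postgres": ("database", "schema"),
    "trino": ("catalog", "schema"),
}


def describe(backend: str) -> str:
    """Summarize how the Ibis terms map onto one backend's native terms."""
    catalog_term, database_term = NATIVE_TERMS[backend]
    if catalog_term is None:
        return f"{backend}: Ibis `database` = native {database_term}; no catalog level"
    return (
        f"{backend}: Ibis `catalog` = native {catalog_term}, "
        f"Ibis `database` = native {database_term}"
    )


print(describe("postgres"))
print(describe("mysql"))
```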