Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions presto-docs/src/main/sphinx/connector/deltalake.rst
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,9 @@ Property Name Description
metastore to find the location of Delta Lake tables.
From the Delta Log at given location, schema and data
file list of the table is found.

``hive.metastore.catalog.name`` Specifies the catalog name to be passed to the metastore.

``delta.parquet-dereference-pushdown-enabled`` Enable pushing nested column dereferences into ``true``
table scan so that only the required fields
selected in a ``struct`` data type column are selected.
Expand Down
2 changes: 2 additions & 0 deletions presto-docs/src/main/sphinx/connector/hive.rst
Original file line number Diff line number Diff line change
Expand Up @@ -206,6 +206,8 @@ Property Name Description
error iterating through empty files.

``hive.file-status-cache.max-retained-size`` Maximum size in bytes of the directory listing cache ``0KB``

``hive.metastore.catalog.name`` Specifies the catalog name to be passed to the metastore.
================================================== ============================================================ ============

Metastore Configuration Properties
Expand Down
2 changes: 2 additions & 0 deletions presto-docs/src/main/sphinx/connector/hudi.rst
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,8 @@ Property Name Description
======================================= ============================================= ===========
``hudi.metadata-table-enabled`` Fetch the list of file names and sizes from false
Hudi's metadata table rather than storage.
``hive.metastore.catalog.name`` Specifies the catalog name to be passed to
the metastore.
======================================= ============================================= ===========

File-Based Metastore
Expand Down
2 changes: 2 additions & 0 deletions presto-docs/src/main/sphinx/connector/iceberg.rst
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,8 @@ Property Name Description
``iceberg.catalog.type`` is ``hive`` and ``hive.metastore``
is ``thrift``.

``hive.metastore.catalog.name`` Specifies the catalog name to be passed to the metastore.

``iceberg.hive-statistics-merge-strategy`` Comma separated list of statistics to use from the
Hive Metastore to override Iceberg table statistics.
The available values are ``NUMBER_OF_DISTINCT_VALUES``
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,7 @@ public class HiveCommonClientConfig
private boolean readNullMaskedParquetEncryptedValueEnabled;
private boolean useParquetColumnNames;
private boolean zstdJniDecompressionEnabled;
private String catalogName;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like a sane default would be the name of the presto catalog. Would it be possible to set that here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we set a default value here, Presto will start sending the catalog name even when the configuration is not explicitly defined in the properties. This could cause issues with different metastores that do not support catalog names as input. The intention was to keep this configurable and only pass it when the metastore can handle it.

Could you share the advantages you see in adding a default value here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The advantage is that users would not have to specify it in the first place when using HMS. Consider two hive catalog configured through hive1.properties and hive2.properties. If they were configured to use the same HMS, then inside the config files then in the catalog configuration you would need to set this property as hive.metastore.catalog.name=hive1, etc. If presto already has the catalog name available, it would seem to make sense to default the catalog name to the configured presto catalog name

I do however understand the desire for passing the catalog name to be optional for the case where a metastore doesn't support the "catalog" feature. I'm not sure what the best course of action is other than adding a boolean flag in addition to this one to support both having a sane default and verifying that the feature is enabled

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't using a boolean hive.metastore.send-catalog-name=true in the core.config suffice?

private DataSize affinitySchedulingFileSectionSize = new DataSize(256, MEGABYTE);

public NodeSelectionStrategy getNodeSelectionStrategy()
Expand Down Expand Up @@ -286,6 +287,19 @@ public HiveCommonClientConfig setZstdJniDecompressionEnabled(boolean zstdJniDeco
return this;
}

public String getCatalogName()
{
return catalogName;
}

@Config("hive.metastore.catalog.name")
@ConfigDescription("Specified property to store the metastore catalog name.")
public HiveCommonClientConfig setCatalogName(String catalogName)
{
this.catalogName = catalogName;
return this;
}

@NotNull
public DataSize getAffinitySchedulingFileSectionSize()
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@
import com.google.common.collect.ImmutableList;
import com.google.common.collect.Iterables;

import javax.annotation.Nullable;

import java.util.HashSet;
import java.util.List;
import java.util.Map;
Expand All @@ -49,6 +51,10 @@

public final class MetadataUtils
{
private static final String CATALOG_DB_SEPARATOR = "#";
private static final String CATALOG_DB_THRIFT_NAME_MARKER = "@";
private static final String DB_EMPTY_MARKER = "!";
private static final String DEFAULT_DATABASE = "default";
private MetadataUtils() {}

public static Optional<DiscretePredicates> getDiscretePredicates(List<ColumnHandle> partitionColumns, List<HivePartition> partitions)
Expand Down Expand Up @@ -160,4 +166,26 @@ private static Domain buildColumnDomain(ColumnHandle column, List<HivePartition>

return Domain.onlyNull(type);
}

/**
* Constructs the schema name, including catalog name if applicable.
*
* @param schemaName the original schema name
* @return the formatted schema name (Example - @catalog_name#schema_name)
*/
public static String constructSchemaName(Optional<String> catalogName, @Nullable String schemaName)
{
if (!catalogName.isPresent() || DEFAULT_DATABASE.equals(schemaName) ||
(schemaName != null && schemaName.contains(CATALOG_DB_SEPARATOR))) {
return schemaName;
}

StringBuilder catalogDatabaseName = new StringBuilder()
.append(CATALOG_DB_THRIFT_NAME_MARKER)
.append(catalogName.get()) // Safe since we checked isPresent()
.append(CATALOG_DB_SEPARATOR)
.append(schemaName == null ? "" : schemaName.isEmpty() ? DB_EMPTY_MARKER : schemaName);

return catalogDatabaseName.toString();
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,7 @@ public void testDefaults()
.setParquetBatchReaderVerificationEnabled(false)
.setParquetBatchReadOptimizationEnabled(false)
.setReadNullMaskedParquetEncryptedValue(false)
.setCatalogName(null)
.setAffinitySchedulingFileSectionSize(new DataSize(256, MEGABYTE)));
}

Expand All @@ -74,6 +75,7 @@ public void testExplicitPropertyMappings()
.put("hive.enable-parquet-batch-reader-verification", "true")
.put("hive.parquet-batch-read-optimization-enabled", "true")
.put("hive.read-null-masked-parquet-encrypted-value-enabled", "true")
.put("hive.metastore.catalog.name", "catalogName")
.put("hive.affinity-scheduling-file-section-size", "512MB")
.build();

Expand All @@ -96,6 +98,7 @@ public void testExplicitPropertyMappings()
.setParquetBatchReaderVerificationEnabled(true)
.setParquetBatchReadOptimizationEnabled(true)
.setReadNullMaskedParquetEncryptedValue(true)
.setCatalogName("catalogName")
.setAffinitySchedulingFileSectionSize(new DataSize(512, MEGABYTE));

ConfigAssertions.assertFullMapping(properties, expected);
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
/*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.facebook.presto.hive;

import org.testng.annotations.Test;

import java.util.Optional;

import static com.facebook.presto.hive.MetadataUtils.constructSchemaName;
import static org.testng.Assert.assertNull;
import static org.testng.Assert.assertTrue;

public class TestMetadataUtils
{
@Test
public void testWithCatalogAndValidSchema()
{
String result = constructSchemaName(Optional.of("testCatalog"), "testSchema");
assertTrue(result.equals("@testCatalog#testSchema"));
}

@Test
public void testWithCatalogAndDefaultSchema()
{
String result = constructSchemaName(Optional.of("testCatalog"), "default");
assertTrue(result.equals("default"));
}

@Test
public void testWithCatalogAndSchemaContainingSeparator()
{
String result = constructSchemaName(Optional.of("testCatalog"), "schema#with#dot");
assertTrue(result.equals("schema#with#dot"));
}

@Test
public void testWithoutCatalog()
{
String result = constructSchemaName(Optional.empty(), "testSchema");
assertTrue(result.equals("testSchema"));
}

@Test
public void testWithNullSchema()
{
String result = constructSchemaName(Optional.empty(), null);
Comment thread
AnuragKDwivedi marked this conversation as resolved.
assertNull(result);
}

@Test
public void testWithoutCatalogNameAndEmptySchemaName()
{
String result = constructSchemaName(Optional.empty(), "");
assertTrue(result.isEmpty());
}

@Test
public void testWithCatalogNameAndEmptySchemaName()
{
String result = constructSchemaName(Optional.of("testCatalog"), "");
assertTrue(result.equals("@testCatalog#!"));
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@
import com.facebook.presto.hive.HiveCoercionPolicy;
import com.facebook.presto.hive.HiveColumnConverterProvider;
import com.facebook.presto.hive.HiveColumnHandle;
import com.facebook.presto.hive.HiveCommonClientConfig;
import com.facebook.presto.hive.HiveEncryptionInformationProvider;
import com.facebook.presto.hive.HiveFileRenamer;
import com.facebook.presto.hive.HiveHdfsConfiguration;
Expand Down Expand Up @@ -137,7 +138,7 @@ public S3SelectTestHelper(String host,
metastoreClientConfig.setMetastoreSocksProxy(HostAndPort.fromString(proxy));
}

HiveCluster hiveCluster = new TestingHiveCluster(metastoreClientConfig, thriftHiveMetastoreConfig, host, port);
HiveCluster hiveCluster = new TestingHiveCluster(metastoreClientConfig, thriftHiveMetastoreConfig, host, port, new HiveCommonClientConfig());
executor = newCachedThreadPool(daemonThreadsNamed("hive-%s"));
HivePartitionManager hivePartitionManager = new HivePartitionManager(FUNCTION_AND_TYPE_MANAGER, config);

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ public class Database
private final PrincipalType ownerType;
private final Optional<String> comment;
private final Map<String, String> parameters;
private final Optional<String> catalogName;

@JsonCreator
public Database(
Expand All @@ -47,14 +48,16 @@ public Database(
@JsonProperty("ownerName") String ownerName,
@JsonProperty("ownerType") PrincipalType ownerType,
@JsonProperty("comment") Optional<String> comment,
@JsonProperty("parameters") Map<String, String> parameters)
@JsonProperty("parameters") Map<String, String> parameters,
@JsonProperty("catalogName") Optional<String> catalogName)
{
this.databaseName = requireNonNull(databaseName, "databaseName is null");
this.location = requireNonNull(location, "location is null");
this.ownerName = requireNonNull(ownerName, "ownerName is null");
this.ownerType = requireNonNull(ownerType, "ownerType is null");
this.comment = requireNonNull(comment, "comment is null");
this.parameters = ImmutableMap.copyOf(requireNonNull(parameters, "parameters is null"));
this.catalogName = requireNonNull(catalogName, "catalogName is null");
}

@JsonProperty
Expand Down Expand Up @@ -103,6 +106,12 @@ public static Builder builder(Database database)
return new Builder(database);
}

@JsonProperty
public Optional<String> getCatalogName()
{
return catalogName;
}

public static class Builder
{
private String databaseName;
Expand All @@ -111,6 +120,7 @@ public static class Builder
private PrincipalType ownerType;
private Optional<String> comment = Optional.empty();
private Map<String, String> parameters = new LinkedHashMap<>();
private Optional<String> catalogName = Optional.empty();

public Builder() {}

Expand All @@ -122,6 +132,7 @@ public Builder(Database database)
this.ownerType = database.ownerType;
this.comment = database.comment;
this.parameters = database.parameters;
this.catalogName = database.catalogName;
}

public Builder setDatabaseName(String databaseName)
Expand Down Expand Up @@ -166,6 +177,12 @@ public Builder setParameters(Map<String, String> parameters)
return this;
}

public Builder setCatalogName(Optional<String> catalogName)
{
this.catalogName = requireNonNull(catalogName, "catalogName is null");
return this;
}

public Database build()
{
return new Database(
Expand All @@ -174,7 +191,8 @@ public Database build()
ownerName,
ownerType,
comment,
parameters);
parameters,
catalogName);
}
}

Expand All @@ -188,6 +206,7 @@ public String toString()
.add("ownerType", ownerType)
.add("comment", comment)
.add("parameters", parameters)
.add("catalogName", catalogName)
.toString();
}

Expand All @@ -207,12 +226,13 @@ public boolean equals(Object o)
Objects.equals(ownerName, database.ownerName) &&
ownerType == database.ownerType &&
Objects.equals(comment, database.comment) &&
Objects.equals(parameters, database.parameters);
Objects.equals(parameters, database.parameters) &&
Objects.equals(catalogName, database.catalogName);
}

@Override
public int hashCode()
{
return Objects.hash(databaseName, location, ownerName, ownerType, comment, parameters);
return Objects.hash(databaseName, location, ownerName, ownerType, comment, parameters, catalogName);
}
}
Loading