techaddict
Contributor

The fix is simple: use the full path, as in the other places where we invoke the shell.

@pwendell
Contributor

Thanks for this. It's a duplicate of #812, though; I created the JIRA after seeing #812.

@techaddict techaddict closed this May 18, 2014
@techaddict
Contributor Author

ok

@techaddict techaddict deleted the SPARK-1869 branch May 18, 2014 20:26
agirish pushed a commit to HPEEzmeral/apache-spark that referenced this pull request May 5, 2022
cloud-fan pushed a commit that referenced this pull request Dec 12, 2022
### What changes were proposed in this pull request?

Remove the `description` override in the V2 file sources. `FileScan` already builds the description from all of the metadata, so repeating the same fields in the overridden description produces duplicates.
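
As a rough illustration of how the duplication arises, here is a toy Python model (not Spark code). The class names `ParquetScanBefore` and `ParquetScanAfter`, the metadata keys chosen, and the string layout are all assumptions made for this sketch; only the idea that the base class renders every metadata entry into the description comes from the text above.

```python
class FileScan:
    """Toy base scan: renders every metadata entry into the description."""

    def metadata(self):
        return {"Format": "generic", "PushedFilters": "[]"}

    def description(self):
        fields = ", ".join(f"{key}: {value}" for key, value in self.metadata().items())
        return f"{type(self).__name__} {fields}"


class ParquetScanBefore(FileScan):
    """Hypothetical subclass that overrides description as well as metadata."""

    def metadata(self):
        return {**super().metadata(), "Format": "parquet", "PushedAggregation": "[MIN(_3)]"}

    def description(self):
        # These fields are already in metadata(), so they are printed twice.
        return super().description() + ", PushedFilters: [], PushedAggregation: [MIN(_3)]"


class ParquetScanAfter(FileScan):
    """Same metadata, but no description override: each field appears once."""

    def metadata(self):
        return {**super().metadata(), "Format": "parquet", "PushedAggregation": "[MIN(_3)]"}


before = ParquetScanBefore().description()
after = ParquetScanAfter().description()
print(before)
print(after)
```

In this model, dropping the override is enough to remove the repeated fields, because the subclass still contributes its extra entries through `metadata()`.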

### Why are the changes needed?

Example parquet scan from the agg pushdown suite:

Before:
```
+- BatchScan parquet file:/...[min(_3)#814, max(_3)#815, min(_1)#816, max(_1)#817, count(*)#818L, count(_1)#819L, count(_2)#820L, count(_3)#821L] ParquetScan DataFilters: [], Format: parquet, Location: InMemoryFileIndex(1 paths)[file:/..., PartitionFilters: [], PushedAggregation: [MIN(_3), MAX(_3), MIN(_1), MAX(_1), COUNT(*), COUNT(_1), COUNT(_2), COUNT(_3)], PushedFilters: [], PushedGroupBy: [], ReadSchema: struct<min(_3):int,max(_3):int,min(_1):int,max(_1):int,count(*):bigint,count(_1):bigint,count(_2)..., PushedFilters: [], PushedAggregation: [MIN(_3), MAX(_3), MIN(_1), MAX(_1), COUNT(*), COUNT(_1), COUNT(_2), COUNT(_3)], PushedGroupBy: [] RuntimeFilters: []
```

After:
```
 +- BatchScan parquet file:/...[min(_3)#814, max(_3)#815, min(_1)#816, max(_1)#817, count(*)#818L, count(_1)#819L, count(_2)#820L, count(_3)#821L] ParquetScan DataFilters: [], Format: parquet, Location: InMemoryFileIndex(1 paths)[file:/..., PartitionFilters: [], PushedAggregation: [MIN(_3), MAX(_3), MIN(_1), MAX(_1), COUNT(*), COUNT(_1), COUNT(_2), COUNT(_3)], PushedFilters: [], PushedGroupBy: [], ReadSchema: struct<min(_3):int,max(_3):int,min(_1):int,max(_1):int,count(*):bigint,count(_1):bigint,count(_2)... RuntimeFilters: []
```

### Does this PR introduce _any_ user-facing change?

Only the description in the explain output changes.

### How was this patch tested?

Updated a few unit tests to check the explain string.

Closes #38229 from Kimahriman/remove-file-source-description.

Authored-by: Adam Binford <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 18, 2022