
hadoop-common is not an optional dependency #2556

@asfimport

Description


parquet-hadoop provides the only mechanism for reading .parquet files, and it declares its dependency on hadoop-common as optional (provided), implying that parquet-hadoop can be used without Hadoop. In practice, however, hadoop-common is required.

The following code is needed to instantiate a ParquetFileReader:

import java.io.{File, FileInputStream}
import org.apache.parquet.io.{DelegatingSeekableInputStream, InputFile, SeekableInputStream}

final class LocalInputFile(file: File) extends InputFile {
  override def getLength(): Long = file.length()
  override def newStream(): SeekableInputStream = {
    val input = new FileInputStream(file)
    new DelegatingSeekableInputStream(input) {
      override def getPos(): Long = input.getChannel.position()
      override def seek(newPos: Long): Unit = {
        // position() returns the channel itself; the result is discarded
        input.getChannel.position(newPos)
      }
    }
  }
}
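The seek implementation above works because FileInputStream shares its file position with the FileChannel it exposes. A parquet-free sketch of that mechanism (SeekDemo and readByteAt are illustrative names, not from the issue):

```scala
import java.io.{File, FileInputStream}

// Sketch of the mechanism LocalInputFile relies on: FileInputStream
// exposes a FileChannel that supports absolute seeks, which is all
// DelegatingSeekableInputStream's getPos/seek need.
object SeekDemo {
  def readByteAt(file: File, offset: Long): Int = {
    val in = new FileInputStream(file)
    try {
      in.getChannel.position(offset) // absolute seek on the underlying channel
      in.read()                      // the stream resumes at the new position
    } finally in.close()
  }
}
```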

but using it fails at runtime because parquet-hadoop has a hard reference to org.apache.hadoop.fs.PathFilter, which in turn depends on org.apache.hadoop.fs.Path; both classes live in hadoop-common.

hadoop-common is an extremely large dependency, and requiring downstream users to pull it in just to read Parquet files is something I would rather avoid.
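Until that changes, downstream builds have to declare hadoop-common themselves despite its "provided" scope. A hypothetical sbt workaround (the version numbers are illustrative, not taken from the issue):

```scala
// Workaround sketch: add hadoop-common explicitly, because parquet-hadoop
// marks it "provided" yet still needs it on the runtime classpath.
libraryDependencies ++= Seq(
  "org.apache.parquet" % "parquet-hadoop" % "1.13.1", // illustrative version
  "org.apache.hadoop"  % "hadoop-common"  % "3.3.6"   // illustrative version
)
```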

A search for "import org.apache.hadoop" in src/main reveals a few more places where the dependency is hard-wired, although these are often in deprecated static constructors and therefore benign.

Reporter: Sam Halliday


Note: This issue was originally created as PARQUET-1953. Please see the migration documentation for further details.
