This repository was archived by the owner on Apr 4, 2019. It is now read-only.

Do I have to set up HDFS in order to use streamX? #60

@iShiBin

It looks like I have to set up Hadoop config files such as core-site.xml and hdfs-site.xml in order to configure S3. However, I could not find the mentioned config/hadoop-conf in my installation (Kafka 0.10.2.0). So do I have to set up HDFS in order to use streamX?
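
For context, this is roughly the kind of Hadoop-style S3 configuration I mean. It is only a sketch: the property names are the standard Hadoop s3a credential settings, the values are placeholders, and I am assuming (not certain) that streamX reads them from core-site.xml in this form.

```xml
<!-- core-site.xml (sketch): standard Hadoop s3a credential properties. -->
<!-- Values are placeholders; whether streamX needs exactly these is an assumption. -->
<configuration>
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>
```

As far as I understand, files like this only configure the Hadoop filesystem client, which is why I am unsure whether an actual HDFS cluster is needed.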

What I am trying to do is convert messages in JSON format to Parquet and then store them in S3.

Spark could do this, but it would require a long-running cluster. Alternatively, I could use checkpoints to run a basic ETL job once per day.
