[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

classic Classic list List threaded Threaded
35 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
GitHub user rdblue opened a pull request:

    https://github.com/apache/incubator-nifi/pull/24

    NIFI-238: Add Kite processors.

    Includes:
    * KiteStorageProcessor - store Avro files in a Kite dataset
    * KiteCSVToAvroProcessor - convert CSV to Avro for storage
    * KiteJSONToAvroProcessor - convert JSON to Avro for storage

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rdblue/incubator-nifi NIFI-238-kite-processors

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-nifi/pull/24.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #24
   
----
commit b96a1b8c14deba10c36f6092888eb7396d0b652b
Author: Ryan Blue <[hidden email]>
Date:   2015-02-13T02:53:49Z

    NIFI-238: Add Kite processors.
   
    Includes:
    * KiteStorageProcessor - store Avro files in a Kite dataset
    * KiteCSVToAvroProcessor - convert CSV to Avro for storage
    * KiteJSONToAvroProcessor - convert JSON to Avro for storage

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

Re: [GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

Mark Payne
Ryan,




I have a few notes/suggestions below. Overall, it looks great - thanks for the work you’ve put into this!




Any reason for the naming? I’d suggest “ConvertCSVToAvro”, “ConvertJSONToAvro” and “PutKite” or something like that. Specifically, we try to avoid having the word “Processor” in the names of the Processors and name them as verbs. This is done specifically because it helps the user understand what is going on when looking at the graph if the processors are named as verbs that describe what they do. (GetFile -> ConvertJSONToAvro -> PutKite is easier to understand than GetFile -> KiteJSONToAvroProcessor -> KiteStorageProcessor)



It looks like you have a couple of places that you allow a comma-separated list of filenames. After splitting, should probably call “trim” so that users can add spaces or new-lines between the items.



You use a “@VisibleForTesting” annotation on your PropertyDescriptors and make them package-level. Generally, we make all PropertyDescriptors public.




In the CSV to Avro processor, you have a member variable “CSVProperties props;”. That is initialized in the @OnScheduled, so it needs to be marked volatile so that threads in the onTrigger method have appropriate access to it.




I would avoid catching FlowFileAccessException or Throwable in the processors - generally, the framework is best at handling these.




I don’t know if this matters because of the readers/writers you’re using but just to be sure: The InputStream and OutputStream that the framework gives to you when you call session.read / session.write are not buffered. So if the readers or writers don’t buffer data themselves performance could hurt. It may help to wrap those streams in BufferedInputStream, BufferedOutputStream?



A couple of the files have Cloudera Copyright notices but most do not.



Thanks

-Mark








From: rdblue
Sent: ‎Sunday‎, ‎February‎ ‎15‎, ‎2015 ‎5‎:‎57‎ ‎AM
To: [hidden email]





GitHub user rdblue opened a pull request:

    https://github.com/apache/incubator-nifi/pull/24

    NIFI-238: Add Kite processors.

    Includes:
    * KiteStorageProcessor - store Avro files in a Kite dataset
    * KiteCSVToAvroProcessor - convert CSV to Avro for storage
    * KiteJSONToAvroProcessor - convert JSON to Avro for storage

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rdblue/incubator-nifi NIFI-238-kite-processors

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-nifi/pull/24.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #24
   
----
commit b96a1b8c14deba10c36f6092888eb7396d0b652b
Author: Ryan Blue <[hidden email]>
Date:   2015-02-13T02:53:49Z

    NIFI-238: Add Kite processors.
   
    Includes:
    * KiteStorageProcessor - store Avro files in a Kite dataset
    * KiteCSVToAvroProcessor - convert CSV to Avro for storage
    * KiteJSONToAvroProcessor - convert JSON to Avro for storage

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24778895
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/pom.xml ---
    @@ -0,0 +1,60 @@
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +  <!--
    +    Licensed to the Apache Software Foundation (ASF) under one or more
    +    contributor license agreements.  See the NOTICE file distributed with
    +    this work for additional information regarding copyright ownership.
    +    The ASF licenses this file to You under the Apache License, Version 2.0
    +    (the "License"); you may not use this file except in compliance with
    +    the License.  You may obtain a copy of the License at
    +        http://www.apache.org/licenses/LICENSE-2.0
    +    Unless required by applicable law or agreed to in writing, software
    +    distributed under the License is distributed on an "AS IS" BASIS,
    +    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +    See the License for the specific language governing permissions and
    +    limitations under the License.
    +  -->
    +  <modelVersion>4.0.0</modelVersion>
    +
    +  <parent>
    +    <groupId>org.apache.nifi</groupId>
    +    <artifactId>nifi-nar-bundles</artifactId>
    +    <version>0.0.1-incubating-SNAPSHOT</version>
    --- End diff --
   
    version should be 0.0.2-incubating-SNAPSHOT now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24778929
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/pom.xml ---
    @@ -0,0 +1,60 @@
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +  <!--
    +    Licensed to the Apache Software Foundation (ASF) under one or more
    +    contributor license agreements.  See the NOTICE file distributed with
    +    this work for additional information regarding copyright ownership.
    +    The ASF licenses this file to You under the Apache License, Version 2.0
    +    (the "License"); you may not use this file except in compliance with
    +    the License.  You may obtain a copy of the License at
    +        http://www.apache.org/licenses/LICENSE-2.0
    +    Unless required by applicable law or agreed to in writing, software
    +    distributed under the License is distributed on an "AS IS" BASIS,
    +    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +    See the License for the specific language governing permissions and
    +    limitations under the License.
    +  -->
    +  <modelVersion>4.0.0</modelVersion>
    +
    +  <parent>
    +    <groupId>org.apache.nifi</groupId>
    +    <artifactId>nifi-nar-bundles</artifactId>
    +    <version>0.0.1-incubating-SNAPSHOT</version>
    +  </parent>
    +
    +  <artifactId>nifi-kite-bundle</artifactId>
    +  <version>0.0.1-incubating-SNAPSHOT</version>
    --- End diff --
   
    apologies if this is a NiFi-ism, but shouldn't this version just come from the parent pom?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779099
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/pom.xml ---
    @@ -0,0 +1,60 @@
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +  <!--
    +    Licensed to the Apache Software Foundation (ASF) under one or more
    +    contributor license agreements.  See the NOTICE file distributed with
    +    this work for additional information regarding copyright ownership.
    +    The ASF licenses this file to You under the Apache License, Version 2.0
    +    (the "License"); you may not use this file except in compliance with
    +    the License.  You may obtain a copy of the License at
    +        http://www.apache.org/licenses/LICENSE-2.0
    +    Unless required by applicable law or agreed to in writing, software
    +    distributed under the License is distributed on an "AS IS" BASIS,
    +    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +    See the License for the specific language governing permissions and
    +    limitations under the License.
    +  -->
    +  <modelVersion>4.0.0</modelVersion>
    +
    +  <parent>
    +    <groupId>org.apache.nifi</groupId>
    +    <artifactId>nifi-nar-bundles</artifactId>
    +    <version>0.0.1-incubating-SNAPSHOT</version>
    +  </parent>
    +
    +  <artifactId>nifi-kite-bundle</artifactId>
    +  <version>0.0.1-incubating-SNAPSHOT</version>
    +  <packaging>pom</packaging>
    +
    +  <name>Kite Bundle</name>
    +  <description>A bundle of processors that use Kite to store data in Hadoop</description>
    +
    +  <modules>
    +    <module>nifi-kite-processors</module>
    +    <module>nifi-kite-nar</module>
    +  </modules>
    +  
    +  <build>
    +    <pluginManagement>
    +      <plugins>
    +        <plugin>
    +          <groupId>org.apache.maven.plugins</groupId>
    +          <artifactId>maven-surefire-plugin</artifactId>
    +          <configuration>
    +            <redirectTestOutputToFile>true</redirectTestOutputToFile>
    +          </configuration>
    +        </plugin>
    +      </plugins>
    +    </pluginManagement>
    +  </build>
    +
    +  <dependencyManagement>
    +    <dependencies>
    +      <dependency>
    +        <groupId>org.apache.nifi</groupId>
    +        <artifactId>nifi-kite-processors</artifactId>
    +        <version>0.0.1-incubating-SNAPSHOT</version>
    --- End diff --
   
    this should probably be a reference to project.version or some other variable.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779130
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/test/java/org/apache/nifi/processors/kite/TestUtil.java ---
    @@ -0,0 +1,84 @@
    +package org.apache.nifi.processors.kite;
    --- End diff --
   
    needs a file-level license


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779145
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/test/java/org/apache/nifi/processors/kite/TestKiteStorageProcessor.java ---
    @@ -0,0 +1,148 @@
    +package org.apache.nifi.processors.kite;
    +
    --- End diff --
   
    needs a file level license


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779153
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/test/java/org/apache/nifi/processors/kite/TestKiteProcessorsCluster.java ---
    @@ -0,0 +1,119 @@
    +package org.apache.nifi.processors.kite;
    +
    --- End diff --
   
    needs a file level license


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779170
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/test/java/org/apache/nifi/processors/kite/TestJSONToAvroProcessor.java ---
    @@ -0,0 +1,42 @@
    +package org.apache.nifi.processors.kite;
    +
    --- End diff --
   
    needs a file level license


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779192
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/test/java/org/apache/nifi/processors/kite/TestConfigurationProperty.java ---
    @@ -0,0 +1,58 @@
    +package org.apache.nifi.processors.kite;
    +
    --- End diff --
   
    needs a file level license


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779198
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/test/java/org/apache/nifi/processors/kite/TestCSVToAvroProcessor.java ---
    @@ -0,0 +1,107 @@
    +package org.apache.nifi.processors.kite;
    +
    --- End diff --
   
    needs a file level license


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779224
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/main/java/org/kitesdk/data/spi/filesystem/CSVFileReaderFixed.java ---
    @@ -0,0 +1,169 @@
    +/*
    + * Copyright 2013 Cloudera Inc.
    --- End diff --
   
    should be an ASF license block.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779233
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/test/java/org/apache/nifi/processors/kite/TestGetSchema.java ---
    @@ -0,0 +1,75 @@
    +package org.apache.nifi.processors.kite;
    +
    --- End diff --
   
    needs a file level license


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779389
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/main/java/org/kitesdk/data/spi/filesystem/CSVFileReaderFixed.java ---
    @@ -0,0 +1,169 @@
    +/*
    + * Copyright 2013 Cloudera Inc.
    + *
    + * Licensed under the Apache License, Version 2.0 (the "License");
    + * you may not use this file except in compliance with the License.
    + * You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.kitesdk.data.spi.filesystem;
    +
    +import au.com.bytecode.opencsv.CSVReader;
    +import java.io.InputStream;
    +import com.google.common.collect.Lists;
    +import java.util.List;
    +import org.apache.hadoop.mapreduce.InputSplit;
    +import org.apache.hadoop.mapreduce.RecordReader;
    +import org.apache.hadoop.mapreduce.TaskAttemptContext;
    +import org.kitesdk.data.DatasetDescriptor;
    +import org.kitesdk.data.DatasetIOException;
    +import org.kitesdk.data.spi.AbstractDatasetReader;
    +import org.kitesdk.data.spi.DescriptorUtil;
    +import org.kitesdk.data.spi.EntityAccessor;
    +import org.kitesdk.data.spi.ReaderWriterState;
    +import com.google.common.base.Preconditions;
    +import org.apache.avro.Schema;
    +import org.apache.hadoop.fs.FSDataInputStream;
    +import org.apache.hadoop.fs.FileSystem;
    +import org.apache.hadoop.fs.Path;
    +import org.kitesdk.data.spi.filesystem.CSVProperties;
    +import org.kitesdk.data.spi.filesystem.CSVUtil;
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +
    +import java.io.IOException;
    +import java.util.NoSuchElementException;
    +
    +import static org.kitesdk.data.spi.filesystem.FileSystemProperties.REUSE_RECORDS;
    +
    +/**
    + * This is a temporary addition. The version in 0.18.0 throws a NPE when the
    + * InputStream constructor is used.
    + */
    +public class CSVFileReaderFixed<E> extends AbstractDatasetReader<E> {
    --- End diff --
   
    What happens if someone is using this version of the reader but drops in a later version of Kite?
   
    Can we make this a non-public class that isn't in the package space of Kite? Could we make it a smaller class that inherited from the Kite class and corrected just the specific behavior we're trying to avoid?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779458
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/main/java/org/apache/nifi/processors/kite/JSONFileReader.java ---
    @@ -0,0 +1,111 @@
    +/*
    + * Copyright 2015 Cloudera Inc.
    --- End diff --
   
    Should be an ASF license.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779489
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/main/java/org/apache/nifi/processors/kite/JSONFileReader.java ---
    @@ -0,0 +1,111 @@
    +/*
    + * Copyright 2015 Cloudera Inc.
    + *
    + * Licensed under the Apache License, Version 2.0 (the "License");
    + * you may not use this file except in compliance with the License.
    + * You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.nifi.processors.kite;
    +
    +import com.fasterxml.jackson.databind.JsonNode;
    +import com.google.common.base.Function;
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterators;
    +import java.io.IOException;
    +import java.io.InputStream;
    +import java.util.Iterator;
    +import javax.annotation.Nullable;
    +import org.apache.avro.Schema;
    +import org.apache.avro.generic.GenericData;
    +import org.kitesdk.data.DatasetIOException;
    +import org.kitesdk.data.spi.AbstractDatasetReader;
    +import org.kitesdk.data.spi.DataModelUtil;
    +import org.kitesdk.data.spi.JsonUtil;
    +import org.kitesdk.data.spi.ReaderWriterState;
    +
    +/**
    + * This is a temporary addition. The version in 0.18.0 throws a NPE when the
    + * InputStream constructor is used.
    + */
    +public class JSONFileReader<E> extends AbstractDatasetReader<E> {
    --- End diff --
   
    What happens if someone is using this version of the reader but drops in a later version of Kite?
   
    Can we make this a non-public class that isn't in the package space of Kite? Could we make it a smaller class that inherited from the Kite class and corrected just the specific behavior we're trying to avoid?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779501
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/main/java/org/apache/nifi/processors/kite/AvroUtil.java ---
    @@ -0,0 +1,24 @@
    +package org.apache.nifi.processors.kite;
    +
    --- End diff --
   
    needs a file-level license


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user joewitt commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779552
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/pom.xml ---
    @@ -0,0 +1,60 @@
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +  <!--
    +    Licensed to the Apache Software Foundation (ASF) under one or more
    +    contributor license agreements.  See the NOTICE file distributed with
    +    this work for additional information regarding copyright ownership.
    +    The ASF licenses this file to You under the Apache License, Version 2.0
    +    (the "License"); you may not use this file except in compliance with
    +    the License.  You may obtain a copy of the License at
    +        http://www.apache.org/licenses/LICENSE-2.0
    +    Unless required by applicable law or agreed to in writing, software
    +    distributed under the License is distributed on an "AS IS" BASIS,
    +    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +    See the License for the specific language governing permissions and
    +    limitations under the License.
    +  -->
    +  <modelVersion>4.0.0</modelVersion>
    +
    +  <parent>
    +    <groupId>org.apache.nifi</groupId>
    +    <artifactId>nifi-nar-bundles</artifactId>
    +    <version>0.0.1-incubating-SNAPSHOT</version>
    +  </parent>
    +
    +  <artifactId>nifi-kite-bundle</artifactId>
    +  <version>0.0.1-incubating-SNAPSHOT</version>
    --- End diff --
   
    You can avoid the artifact version itself but I believe you must reference the parent version.  Variables for versions work fine for 3P deps but less so for items within the multi-module build.  The release plugin takes care of version changes for those though so this is not an issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779558
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/main/java/org/apache/nifi/processors/kite/AvroUtil.java ---
    @@ -0,0 +1,24 @@
    +package org.apache.nifi.processors.kite;
    +
    +import org.apache.avro.Schema;
    +import org.apache.avro.SchemaBuilder;
    +import org.apache.avro.generic.GenericData;
    +import org.apache.avro.generic.IndexedRecord;
    +import org.apache.avro.io.DatumReader;
    +import org.apache.avro.io.DatumWriter;
    +
    +import static org.apache.avro.generic.GenericData.StringType;
    +
    +public class AvroUtil {
    --- End diff --
   
    make package-private


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-nifi pull request: NIFI-238: Add Kite processors.

JPercivall
In reply to this post by JPercivall
Github user busbey commented on a diff in the pull request:

    https://github.com/apache/incubator-nifi/pull/24#discussion_r24779991
 
    --- Diff: nifi/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/pom.xml ---
    @@ -0,0 +1,160 @@
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +  <!--
    +    Licensed to the Apache Software Foundation (ASF) under one or more
    +    contributor license agreements.  See the NOTICE file distributed with
    +    this work for additional information regarding copyright ownership.
    +    The ASF licenses this file to You under the Apache License, Version 2.0
    +    (the "License"); you may not use this file except in compliance with
    +    the License.  You may obtain a copy of the License at
    +        http://www.apache.org/licenses/LICENSE-2.0
    +    Unless required by applicable law or agreed to in writing, software
    +    distributed under the License is distributed on an "AS IS" BASIS,
    +    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +    See the License for the specific language governing permissions and
    +    limitations under the License.
    +  -->
    +  <modelVersion>4.0.0</modelVersion>
    +
    +  <parent>
    +    <groupId>org.apache.nifi</groupId>
    +    <artifactId>nifi-kite-bundle</artifactId>
    +    <version>0.0.1-incubating-SNAPSHOT</version>
    +  </parent>
    +
    +  <artifactId>nifi-kite-processors</artifactId>
    +  <packaging>jar</packaging>
    +  <name>Kite Hadoop Processors</name>
    +
    +  <properties>
    +    <kite.version>0.18.0</kite.version>
    +    <guava.version>11.0.2</guava.version>
    +    <jsr305.version>2.0.1</jsr305.version>
    +    <junit.version>4.10</junit.version>
    +  </properties>
    +
    +  <dependencies>
    +    <!-- NiFi -->
    +
    +    <dependency>
    +      <groupId>org.apache.nifi</groupId>
    +      <artifactId>nifi-api</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.nifi</groupId>
    +      <artifactId>nifi-utils</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.nifi</groupId>
    +      <artifactId>nifi-processor-utils</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.nifi</groupId>
    +      <artifactId>nifi-flowfile-packager</artifactId>
    +    </dependency>
    +
    +    <!-- Kite -->
    +
    +    <dependency>
    +      <groupId>org.kitesdk</groupId>
    +      <artifactId>kite-data-core</artifactId>
    +      <version>${kite.version}</version>
    +    </dependency>
    +
    +    <dependency>
    +      <groupId>org.kitesdk</groupId>
    +      <artifactId>kite-data-hive</artifactId>
    +      <version>${kite.version}</version>
    +    </dependency>
    +
    +    <dependency>
    +      <groupId>org.kitesdk</groupId>
    +      <artifactId>kite-data-hbase</artifactId>
    +      <version>${kite.version}</version>
    +    </dependency>
    +
    +    <dependency>
    +      <groupId>org.kitesdk</groupId>
    +      <artifactId>kite-hadoop-dependencies</artifactId>
    +      <type>pom</type>
    +      <version>${kite.version}</version>
    +    </dependency>
    +
    +    <dependency>
    +      <groupId>org.kitesdk</groupId>
    +      <artifactId>kite-hbase-dependencies</artifactId>
    +      <type>pom</type>
    +      <scope>provided</scope>
    +      <optional>true</optional>
    +      <version>${kite.version}</version>
    +    </dependency>
    +
    +    <dependency>
    +      <groupId>com.google.guava</groupId>
    +      <artifactId>guava</artifactId>
    +      <version>${guava.version}</version>
    +      <scope>compile</scope>
    +    </dependency>
    +
    +    <dependency>
    +      <!-- avoid warnings by bundling annotations -->
    +      <groupId>com.google.code.findbugs</groupId>
    +      <artifactId>jsr305</artifactId>
    +      <scope>compile</scope>
    +      <version>${jsr305.version}</version>
    +    </dependency>
    +
    +    <dependency>
    +      <!-- avoid warnings by bundling annotations -->
    +      <groupId>com.google.code.findbugs</groupId>
    +      <artifactId>annotations</artifactId>
    --- End diff --
   
    This is an LGPL dependency, so you can't include it. ([ref asf license guidelines](http://www.apache.org/legal/resolved.html#category-x))
   
    There is a cleanroom reimplementation that is appropriately licensed: https://github.com/stephenc/findbugs-annotations


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
12