[DISCUSS] Deprecate processors that have a Record-oriented counterpart?


[DISCUSS] Deprecate processors that have a Record-oriented counterpart?

Sivaprasanna Sethuraman
Team,

Ever since the Record-based processors were first introduced, there has
been active development on the Record APIs and constant interest in
introducing new Record-oriented processors. It has reached the point where
almost all the processors that deal with mainstream technologies have a
Record-based counterpart, such as the processors for MongoDB, Kafka, RDBMS,
HBase, etc. These Record-based processors overcome the limitations of the
standard processors, letting us build flows that are concise and efficient,
especially when dealing with structured data. Moreover, with the recent
NiFi release (1.9), we now have a schema inference capability that further
simplifies building flows with such processors. Having said that, I'm
wondering if this is the right time to discuss deprecating processors that
the community believes have a much better Record-oriented counterpart
covering all the functionality currently offered by the standard processor.

There are a few things that have to be discussed, like how a deprecated
processor should be displayed in the UI, but before going down that route,
I want to understand the community's thoughts on this.

Thanks,
Sivaprasanna

Re: [DISCUSS] Deprecate processors that have a Record-oriented counterpart?

Andrew Grande-2
I'm not sure deprecating is warranted. In my experience, record-based
processors are very powerful, but they have a steep learning curve the way
they are in NiFi today, and, frankly, simple things should be dead simple.

Now, moving the record UX towards an easy extreme affects this equation,
but, for example, I never open a conversation with a new user by talking
about records, the Schema Registry, or the NiFi Registry.

Maybe there's something coming up that I'm not aware of yet? Please share.

Andrew


Re: [DISCUSS] Deprecate processors that have a Record-oriented counterpart?

Ryan Hendrickson-2
We often don't use the Record processors because of the schema requirement
and the complexity of using the LookupRecord processor.

I'll refer to this email on the NiFi mailing list: "GetMongo - Pass-on
Initial FlowFile?"... There were suggestions to use the LookupRecord
processor, but ultimately it couldn't do what we needed done, so we had to
string together a set of other processors.

For us, it was easier to string together a set of processors than to figure
out why LookupRecord, MongoDBLookupService, and InferAvroSchema weren't
getting the job done for us.
           /--success---> ReplaceText (Prepend JSON Key) --success--\
GetMongo --+                                                         +--> MergeContent (Combine on
           \--original--> ReplaceText (Prepend JSON Key) --success--/     Correlation Attribute Name, Binary Concat)
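For context, the enrichment LookupRecord aims to provide amounts to roughly the following per-record logic. This is a plain-Python sketch only; the in-memory dict stands in for a real lookup service such as MongoDBLookupService, and the field names are invented for illustration:

```python
# Sketch of record-level lookup enrichment, as a record-oriented lookup
# step would apply it. A dict stands in for the real lookup service.

def enrich(records, lookup_table, key_field, result_field):
    """Attach lookup_table[record[key_field]] to each record, if found."""
    enriched = []
    for record in records:
        record = dict(record)  # avoid mutating the caller's data
        match = lookup_table.get(record.get(key_field))
        if match is not None:
            record[result_field] = match
        enriched.append(record)
    return enriched

records = [{"user_id": "u1", "action": "login"},
           {"user_id": "u2", "action": "logout"}]
users = {"u1": {"name": "Alice"}, "u2": {"name": "Bob"}}

print(enrich(records, users, "user_id", "user"))
```

The flow above reproduces this by hand: branch, rewrite each half with ReplaceText, then merge the halves back on a correlation attribute.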


If they're marked as deprecated, I'd really like to see the barrier to
entry for the LookupRecord processors decreased. The number one thing I
don't like about the Record processors is that they require a schema, while
the complementary processor(s?), specifically the GetMongo one, do not.

Ryan


Re: [DISCUSS] Deprecate processors that have a Record-oriented counterpart?

Lars Francke
I'm also against deprecation.

Sometimes it's nice to throw a quick workflow together where I don't care
about schemas at all.





Re: [DISCUSS] Deprecate processors that have a Record-oriented counterpart?

Mike Thomsen
> The number 1 thing I don't like about the Record processors is that they
require a schema, and the complementary processor(s?), specifically the
GetMongo one, does not require a schema.

FWIW, we just added GetMongoRecord in 1.9.0, along with GridFS processors.

I'll note that arguably the best reason to take the dive into using the
Record API with Mongo is precisely that Mongo doesn't even have
schema-on-write. It's entirely possible that nine out of ten people on your
team write dates the way you agreed upon while the one holdout does the
polar opposite, and you won't know until random, bizarre behavior shows up.
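To make that failure mode concrete, here is a small stdlib-only sketch (the field name and date formats are made up for illustration): nine documents use the agreed format, one holdout doesn't, and nothing complains until something actually tries to parse the field — which is exactly what a schema would have caught on write.

```python
from datetime import datetime

# Schemaless store: nothing validates on write, so both formats get in.
docs = [{"id": i, "created": "2019-02-23"} for i in range(9)]
docs.append({"id": 9, "created": "02/23/2019"})  # the one holdout

def parse_created(doc):
    # The format the team "agreed upon"; enforcing it up front,
    # as a schema would, surfaces the bad document immediately.
    return datetime.strptime(doc["created"], "%Y-%m-%d")

bad = []
for doc in docs:
    try:
        parse_created(doc)
    except ValueError:
        bad.append(doc["id"])

print(bad)  # only the holdout fails
```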


Re: [DISCUSS] Deprecate processors that have a Record-oriented counterpart?

Mike Thomsen
Sivaprasanna,

FWIW, I think there might be merit to deprecating the convert-to-Avro
processors, but I think the rest should stay. With Avro, I feel there is
intrinsic danger in giving people that option if they're unwilling to learn
how to write an Avro schema.
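For reference, writing an Avro schema is mostly a matter of JSON. A minimal record schema might look like the following (the record name, namespace, and fields here are invented for illustration; the nullable-union and timestamp-millis patterns are standard Avro):

```python
import json

# A minimal Avro record schema, written out as JSON.
schema_json = """
{
  "type": "record",
  "name": "User",
  "namespace": "com.example",
  "fields": [
    {"name": "id",      "type": "long"},
    {"name": "name",    "type": "string"},
    {"name": "email",   "type": ["null", "string"], "default": null},
    {"name": "created", "type": {"type": "long",
                                 "logicalType": "timestamp-millis"}}
  ]
}
"""

schema = json.loads(schema_json)
print([f["name"] for f in schema["fields"]])
```

The union `["null", "string"]` with a `null` default is how Avro expresses an optional field, and declaring the date as a `timestamp-millis` logical type is what prevents the free-form date strings described above.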


Re: [DISCUSS] Deprecate processors that have a Record-oriented counterpart?

Bryan Bende
One thing I would add is that the 1.9.0 release now has schema inference
built in, so you can start using the record processors without having a
schema.

That being said, I am neutral about deprecating the non-record processors
for source and destination systems.

The processors I would definitely be in favor of deprecating are the
conversion processors that are replaced by ConvertRecord (Avro to JSON,
JSON to Avro, CSV to Avro, whatever other combos) and InferAvroSchema. All
of those should be handled by ConvertRecord plus the built-in schema
inference option in the readers and writers.
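The conversion those processors perform boils down to something like the following stdlib-only sketch: read CSV, infer a crude per-column type the way a schema-inference reader might, and emit JSON records. This is a toy illustration of the reader/writer split, not how ConvertRecord is actually implemented:

```python
import csv
import io
import json

def infer(value):
    # Crude per-value type inference: try int, then float, else string.
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value

def csv_to_json_records(csv_text):
    # "Reader" half: parse CSV rows and apply inferred types.
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{k: infer(v) for k, v in row.items()} for row in reader]

csv_text = "id,name,score\n1,alice,9.5\n2,bob,7\n"
# "Writer" half: serialize the typed records as JSON.
print(json.dumps(csv_to_json_records(csv_text)))
```

In ConvertRecord the same shape holds: the record reader (with inference enabled) owns the parsing and typing, and the record writer owns the output format, so every format pair is covered by one processor.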


Re: [DISCUSS] Deprecate processors that have a Record-oriented counterpart?

Andy LoPresto-2
I think there are legitimate use cases for the “legacy” approaches and we should not deprecate them. However, I do think there can be better education and gentle guidance steering new users toward the record-oriented processors when appropriate. Whether that takes the form of a linked note in the processor description shown in the Add Processor dialog, improved documentation on the website, wizards/walkthroughs, etc. is certainly a good topic for conversation here.

The ConvertXtoY processors should definitely be deprecated.


Andy LoPresto
[hidden email]
[hidden email]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


Re: [DISCUSS] Deprecate processors who have Record oriented counterpart?

Mike Thomsen
In the last year, I've joined two teams that were getting started with NiFi,
and I think I was the only person on each team who knew about the Record
API.

A few days ago, someone at my client gave a presentation on NiFi and was
talking about the "NiFi forums." A quick Google search for "NiFi
Community Support" showed that Hortonworks's forums rank above any
nifi.apache.org reference. So we might have an SEO problem on our hands
too when it comes to getting our preferred documentation and guides into
users' hands.

On Mon, Feb 25, 2019 at 3:12 PM Andy LoPresto <[hidden email]> wrote:

> I think there are legitimate use cases for the “legacy” approaches and we
> should not deprecate them. However, I do think there can be better
> education and gentle guidance of new users to prefer the record-oriented
> processors over the legacy processors when appropriate. Whether this is a
> linked note in the processor description shown in the Add Processor dialog,
> improved documentation on the website, wizards/walkthroughs, etc. is
> certainly a good topic for conversation here.
>
> The ConvertXtoY processors should definitely be deprecated.
>
>
> Andy LoPresto
> [hidden email]
> [hidden email]
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> > On Feb 23, 2019, at 12:42 PM, Bryan Bende <[hidden email]> wrote:
> >
> > One thing I would add is that in the 1.9.0 release there is now schema
> > inference built in so that you can just start using the record processors
> > without having a schema.
> >
> > That being said I am neutral about deprecating the non-record processors
> > for source and destination systems.
> >
> > The processors I would definitely be in favor of deprecating are the
> > conversion processors that are replaced by ConvertRecord (Avro to JSON,
> > JSON to Avro, CSV to Avro, whatever other combos) and InferAvroSchema. All
> > of those should be handled by ConvertRecord + the built-in schema inference
> > option in the readers and writers.
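Bryan's point about built-in inference can be illustrated with a toy sketch. This is not NiFi's actual 1.9.0 inference algorithm, only a minimal illustration of the idea that a schema can be derived from sample records instead of being hand-written:

```python
# Toy sketch of record-reader schema inference: walk sample records and
# derive a field-to-type mapping, widening to a union when values disagree.
# Illustrative only -- NOT NiFi's actual implementation.
def infer_schema(records):
    schema = {}
    for record in records:
        for field, value in record.items():
            seen = type(value).__name__
            prev = schema.get(field)
            if prev is None:
                schema[field] = seen
            elif prev != seen and not prev.startswith("union"):
                # Conflicting types across records widen to a union.
                schema[field] = "union(%s, %s)" % (prev, seen)
    return schema

sample = [
    {"id": 1, "name": "a", "score": 9.5},
    {"id": 2, "name": "b", "score": 8},  # int here, float above -> union
]
print(infer_schema(sample))
```

Real implementations also have to handle nested records, nullability, and numeric type widening (e.g., int to long), which is where most of the complexity lives.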
> >
> > On Sat, Feb 23, 2019 at 1:23 PM Mike Thomsen <[hidden email]>
> wrote:
> >
> >> Sivaprasanna,
> >>
> >> FWIW, I think there might be merit to deprecating converting to Avro, but
> >> the rest I think should stay. With Avro, I feel like there is intrinsic
> >> danger in giving people that option if they're unwilling to learn how to
> >> write an Avro schema.
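For readers unfamiliar with what "writing an Avro schema" entails, here is a minimal, hypothetical example (the record and its fields are invented for illustration). Note that dates are not a native Avro type; they are an int carrying a logicalType annotation, which is exactly the kind of convention Mike says people need to learn:

```python
import json

# A minimal Avro schema of the kind alluded to above: every field, its
# type, and its nullability must be spelled out explicitly.
user_schema_json = """
{
  "type": "record",
  "name": "User",
  "namespace": "example.nifi",
  "fields": [
    {"name": "id",      "type": "long"},
    {"name": "name",    "type": "string"},
    {"name": "created", "type": {"type": "int", "logicalType": "date"}},
    {"name": "email",   "type": ["null", "string"], "default": null}
  ]
}
"""

schema = json.loads(user_schema_json)
print([f["name"] for f in schema["fields"]])
```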
> >>
> >> On Sat, Feb 23, 2019 at 1:21 PM Mike Thomsen <[hidden email]>
> >> wrote:
> >>
> >>>> The number 1 thing I don't like about the Record processors is that they
> >>>> require a Schema, and the complementary processor(s?), specifically the
> >>>> GetMongo one, does not require a schema.
> >>>
> >>> FWIW, we just added GetMongoRecord in 1.9.0, along with GridFS processors.
> >>>
> >>> I'll note that arguably the best reason for you to take the dive into
> >>> being able to use the Record API w/ Mongo is precisely that Mongo doesn't
> >>> even have schema on write. It's entirely possible that 9 out of 10 people
> >>> on your team write a date the right way you all agreed upon, and the 1
> >>> hold-out does the polar opposite, and you won't know until random, bizarre
> >>> behavior shows up.
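Mike's scenario can be sketched in a few lines. The documents below are hypothetical; the point is that a declared schema (or a type check standing in for one) catches the hold-out immediately, while a schemaless store accepts the mismatch silently:

```python
from datetime import date

# Hedged illustration of the schema-on-write point: with a schemaless
# store, nothing stops one writer from storing dates as strings while
# everyone else uses proper date values.
docs = [
    {"user": "a", "joined": date(2019, 2, 23)},
    {"user": "b", "joined": date(2019, 2, 24)},
    {"user": "c", "joined": "02/24/2019"},  # the one hold-out
]

def nonconforming(documents, field, expected_type):
    """Return documents whose field is not of the agreed-upon type."""
    return [d for d in documents if not isinstance(d.get(field), expected_type)]

print([d["user"] for d in nonconforming(docs, "joined", date)])
```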
> >>>
> >>> On Sat, Feb 23, 2019 at 12:06 PM Ryan Hendrickson <
> >>> [hidden email]> wrote:
> >>>
> >>>> We often don't use the Record Processors because of the Schema requirement
> >>>> and the complexity of using the LookupRecord processor.
> >>>>
> >>>> I'll refer to this email on the NiFi mailing list: "GetMongo - Pass-on
> >>>> Initial FlowFile?"... There were suggestions to use the LookupRecord
> >>>> processor, but ultimately it couldn't do what we needed done, so we
> >>>> had to string together a set of other processors.
> >>>>
> >>>> For us, it was easier to string together a set of processors than to
> >>>> figure out why LookupRecord, MongoDBLookupService, and InferAvroSchema
> >>>> weren't getting the job done for us.
> >>>>              /---success----> *ReplaceText* (Prepend JSON Key) ---success---\
> >>>> *GetMongo* -<                                                                >--> *MergeContent* (Combine on Correlation Attribute Name, Binary Concat)
> >>>>              \---original---> *ReplaceText* (Prepend JSON Key) ---success---/
> >>>>
> >>>>
> >>>> If they're marked as deprecated, I'd really like to see the barrier to
> >>>> entry with the LookupRecord processors decreased.  The number 1 thing I
> >>>> don't like about the Record processors is that they require a Schema, and
> >>>> the complementary processor(s?), specifically the GetMongo one, does not
> >>>> require a schema.
> >>>>
> >>>> Ryan
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Deprecate processors who have Record oriented counterpart?

Otto Fowler
In reply to this post by Andy LoPresto-2
It is probably worth thinking about getting *new* processors written as
record processors from the start, i.e., taking a developer focus as well.
If we can work out what would make that easier (we started a discussion on
record/service use in the archetype), it would be worth it.

We can definitely make it easier to build record-based processors, new
record readers, new record writers, and custom schema registry impls.
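As a rough illustration of the reader abstraction Otto is describing (the class and method names below are invented for this sketch, not NiFi's actual Java interfaces), a record reader pairs a schema with a stream of records, so any format behind it can feed record-oriented processors:

```python
import csv
import io

# Language-agnostic sketch of the record-reader pattern: expose a schema
# plus a record stream, so downstream record processors never see the
# underlying format. Names are illustrative only.
class CsvRecordReader:
    def __init__(self, text):
        self._reader = csv.DictReader(io.StringIO(text))
        self.schema = self._reader.fieldnames  # derived from the header row

    def next_record(self):
        """Return the next record as a dict, or None when exhausted."""
        return next(self._reader, None)

reader = CsvRecordReader("id,name\n1,alice\n2,bob\n")
records = []
while (rec := reader.next_record()) is not None:
    records.append(rec)
print(reader.schema, len(records))
```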


