Accumulo processors

Accumulo processors

davidrsmith
Hi

A team at work needs to interface with Accumulo. Has anyone tried this? I know that a while ago Mark Payne raised NiFi Jira ticket 818 (NIFI-818), but as far as I am aware it was never completed.
I would be grateful if anyone could help, or point me to Mark's code to give us a start.

Many thanks
Dave




Re: Accumulo processors

Mark Payne
Hi Dave,

I do have a branch on GitHub with the work that I had done: https://github.com/apache/nifi/tree/NIFI-818
To be perfectly honest, though, I have absolutely no idea what state the code is in, if it's been tested, etc.
But you're welcome to take it and run with it, if you'd like.

Thanks
-Mark


On May 26, 2018, at 12:05 PM, davidrsmith <[hidden email]<mailto:[hidden email]>> wrote:

Hi

A team at work needs to interface with Accumulo. Has anyone tried this? I know that a while ago Mark Payne raised NiFi Jira ticket 818 (NIFI-818), but as far as I am aware it was never completed.
I would be grateful if anyone could help, or point me to Mark's code to give us a start.

Many thanks
Dave




Re: Accumulo processors

davidrsmith
Mark,
Thanks for the link. I have downloaded the code from GitHub; it will be a good basis to start with.
Many thanks,
Dave
 

    On Saturday, 26 May 2018, 20:22, Mark Payne <[hidden email]> wrote:
 

 Hi Dave,

I do have a branch on GitHub with the work that I had done: https://github.com/apache/nifi/tree/NIFI-818
To be perfectly honest, though, I have absolutely no idea what state the code is in, if it's been tested, etc.
But you're welcome to take it and run with it, if you'd like.

Thanks
-Mark



Re: Accumulo processors

Mike Thomsen
Dave,

I don't know how far along Mark's code is, but you might find some useful
code you can borrow from the HBase commit(s) that added visibility label
support. For example, there is code for handling visibility labels w/
PutHBaseRecord that might be reusable if you want to create a put record
processor for Accumulo. It won't help you with the Accumulo APIs, but it
could provide useful strategies for how to identify and assign labels from
user input.
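
One way to picture "identify and assign labels from user input" is a lookup that prefers the most specific user-supplied label and falls back to a default. This is only a rough sketch under assumptions: the function name, the `family.qualifier` key format, and the example labels are hypothetical and are not taken from the HBase processors or from any Accumulo API.

```python
def resolve_visibility(family, qualifier, overrides, default=""):
    """Return the most specific label configured for a column.

    Lookup order (an assumption for this sketch): "family.qualifier",
    then "family", then the default label.
    """
    for key in (f"{family}.{qualifier}", family):
        if key in overrides:
            return overrides[key]
    return default


# Hypothetical user input: one label for a whole family, one for a single column.
overrides = {"pii": "PII", "pii.ssn": "PII&ADMIN"}

print(resolve_visibility("pii", "ssn", overrides, "PUBLIC"))   # column-level label
print(resolve_visibility("pii", "name", overrides, "PUBLIC"))  # family-level label
print(resolve_visibility("meta", "ts", overrides, "PUBLIC"))   # default label
```

The resolved string would then be handed to whatever visibility type the target store expects.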

On Sun, May 27, 2018 at 3:14 PM DAVID SMITH
<[hidden email]> wrote:

> Mark,
> Thanks for the link. I have downloaded the code from GitHub; it will be a
> good basis to start with.
> Many thanks,
> Dave

Re: Accumulo processors

davidrsmith
Mike,
Thanks for the suggestion. I will certainly have a look at the HBase code. I'm not really familiar with Accumulo; I am currently reading the docs on the Apache website.
Dave
 

    On Sunday, 27 May 2018, 21:25, Mike Thomsen <[hidden email]> wrote:
 

 Dave,

I don't know how far along Mark's code is, but you might find some useful
code you can borrow from the HBase commit(s) that added visibility label
support. For example, there is code for handling visibility labels w/
PutHBaseRecord that might be reusable if you want to create a put record
processor for Accumulo. It won't help you with the Accumulo APIs, but it
could provide useful strategies for how to identify and assign labels from
user input.


Re: Accumulo processors

Mike Thomsen
If you do a PR, ping @joshelser. He works on HBase and Accumulo and
mentioned to me that he might be up for a code review on the Accumulo
processors.

On Sun, May 27, 2018 at 4:38 PM DAVID SMITH
<[hidden email]> wrote:

> Mike,
> Thanks for the suggestion. I will certainly have a look at the HBase code.
> I'm not really familiar with Accumulo; I am currently reading the docs on
> the Apache website.
> Dave

Re: Accumulo processors

davidrsmith
Ok, thanks. I will keep that in mind for when I (or whoever on my team writes these) get to the point where we would like a review before trying to submit back to open source.
Dave
 

    On Sunday, 27 May 2018, 21:51, Mike Thomsen <[hidden email]> wrote:
 

 If you do a PR, ping @joshelser. He works on HBase and Accumulo and
mentioned to me that he might be up for a code review on the Accumulo
processors.


Re: Accumulo processors

Mike Thomsen
David,

Any progress? Been flirting with the idea of looking at Accumulo and wanted
to sync up.

Thanks,

Mike

On Sun, May 27, 2018 at 5:06 PM DAVID SMITH
<[hidden email]> wrote:

> Ok, thanks. I will keep that in mind for when I (or whoever on my team
> writes these) get to the point where we would like a review before trying
> to submit back to open source.
> Dave

Re: Accumulo processors

Marc Parisi
Hey Mike,
  I recall looking at Mark's PR in May and believe it fits the bill for a
processor that live-ingests data into Accumulo. Just be cautious of minor
and major compaction storms, which may cause backpressure. Bulk load is
also possible -- and probably even easier with the record-oriented paradigm
-- but this is a great step.

   If you have a relatively large batch size and want to reduce the number
of mutations (keep in mind that internally the batch writer will make a
copy of each mutation), there may be some benefit in reusing *mutation* [1]
if those puts all fall within the same row, or a small set of rows, within
a table. I don't suspect this will be a limiting factor for most, so it
probably depends on your situation.

[1]
https://github.com/apache/nifi/commit/114c8578e097c338b44d31e38454b199f2bb2660#diff-b68a6e83fae66bb7f1617ade75661fa4R243
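
To illustrate the mutation-reuse idea in a language-neutral way, here is a minimal sketch that groups cell-level puts by row, so that one mutation object per row (rather than one per cell) would be handed to the writer. `RowMutation` and `group_puts_by_row` are hypothetical stand-ins invented for this example; they are not the Accumulo classes and not the code in the linked commit.

```python
from dataclasses import dataclass, field


@dataclass
class RowMutation:
    """Hypothetical stand-in for a mutation: all pending updates for one row."""
    row: str
    updates: list = field(default_factory=list)

    def put(self, family, qualifier, visibility, value):
        self.updates.append((family, qualifier, visibility, value))


def group_puts_by_row(puts):
    """Build one RowMutation per distinct row instead of one per cell."""
    mutations = {}  # row id -> RowMutation, kept in first-seen order
    for row, family, qualifier, visibility, value in puts:
        mutation = mutations.get(row)
        if mutation is None:
            mutation = mutations[row] = RowMutation(row)
        mutation.put(family, qualifier, visibility, value)
    return list(mutations.values())


puts = [
    ("row1", "cf", "name", "public", "alice"),
    ("row1", "cf", "email", "private", "alice@example.com"),
    ("row2", "cf", "name", "public", "bob"),
]
mutations = group_puts_by_row(puts)
print(len(puts), "puts ->", len(mutations), "mutations")  # 3 puts -> 2 mutations
```

The saving only shows up when many puts in a batch share rows; if every put targets a different row, the grouping produces just as many mutations as before.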



On Mon, Sep 17, 2018 at 3:43 PM Mike Thomsen <[hidden email]> wrote:

> David,
>
> Any progress? Been flirting with the idea of looking at Accumulo and wanted
> to sync up.
>
> Thanks,
>
> Mike