Running processors simultaneously

Vdsa
I have a custom processor that generates some files, and a ListFile processor that picks those files up so they can be moved to HDFS. When I start the process group, all of the processors begin executing at once. Because ListFile does not depend on any input, it starts looking for the files before they have been generated, so I get an error saying that the file does not exist. How can I make the downstream processor wait until the files are available?

Re: Running processors simultaneously

Matt Burgess-2
It sounds like your custom processor is generating files on disk. Does
it output any flow files, perhaps one flow file per generated on-disk
file? If the custom processor could output a flow file for each file it
generates, with the "absolute.path" and "filename" attributes set, that
would roughly emulate a ListFile processor, so your custom processor
could be followed directly by a FetchFile (or FetchHDFS or whatever)
processor. That keeps the flow serial, rather than having ListFile run
concurrently with the processor that generates the files. If I've
misunderstood anything, please let me know.

Regards,
Matt
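
A minimal sketch of the attribute side of this suggestion, in plain Java so that it runs without NiFi on the classpath. The class name GeneratedFileAttributes and the method forFile are hypothetical helpers, not part of NiFi, and the trailing separator on "absolute.path" is an assumption about how ListFile formats that attribute. The NiFi calls named in the comments (session.create, session.putAllAttributes, session.transfer) would go in the custom processor's onTrigger; REL_SUCCESS stands for whatever relationship that processor defines.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of NiFi): builds the two attributes named
// above for one file that the custom processor has just written to disk.
class GeneratedFileAttributes {

    static Map<String, String> forFile(Path file) {
        Path absolute = file.toAbsolutePath().normalize();
        String directory = absolute.getParent().toString();
        // Assumption: the directory is reported with a trailing separator.
        if (!directory.endsWith(File.separator)) {
            directory = directory + File.separator;
        }
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("absolute.path", directory);
        attributes.put("filename", absolute.getFileName().toString());
        return attributes;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = forFile(Paths.get("/tmp/out/data.csv"));
        // In the custom processor's onTrigger, the map would be applied roughly as:
        //   FlowFile flowFile = session.create();
        //   flowFile = session.putAllAttributes(flowFile, attributes);
        //   session.transfer(flowFile, REL_SUCCESS); // REL_SUCCESS: the processor's own relationship
        attributes.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

One flow file would be created per generated file, after that file has been fully written, so that the downstream fetch never sees a partial file.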

On Tue, May 9, 2017 at 7:20 AM, Vdsa <[hidden email]> wrote:

> I have a custom processor which generates some files, listFile takes those
> files and moves those files to HDFS, when I run a processor group it starts
> executing all the processors and as list file is independent of any input it
> starts fetching file and it does not get that. Thus i get error that file
> does not exist. How do I wait processor till the time i get files?
>
>
>
> --
> View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/Running-processors-simulaneously-tp15774.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

RE: Running processors simultaneously

peter-ho
One potential option is to generate each file as a hidden file, and to rename it so that it is no longer hidden once it has been completely written. That way it is only picked up when it is finished.

Thanks
Peter
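
A minimal sketch of this hide-then-rename approach, assuming a Unix-style file system where a leading dot marks a file as hidden, and assuming the listing processor is configured to skip hidden files. The class name HiddenThenRename, the method writeThenReveal, and the ".part" suffix are hypothetical choices for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: writes a file under a hidden (dot-prefixed) name and
// only exposes it under its final name once the content is complete.
class HiddenThenRename {

    static Path writeThenReveal(Path directory, String finalName, byte[] content) throws IOException {
        Path hidden = directory.resolve("." + finalName + ".part");
        Path visible = directory.resolve(finalName);
        // While this write is in progress, only the hidden name exists.
        Files.write(hidden, content);
        // Rename within the same directory; ATOMIC_MOVE asks the file system
        // to make the final name appear in a single step.
        return Files.move(hidden, visible, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path directory = Files.createTempDirectory("generated");
        Path result = writeThenReveal(directory, "report.csv",
                "a,b\n1,2\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("Visible file: " + result);
    }
}
```

The rename stays inside one directory on purpose, since a move across file systems may not be atomic.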

From: Matt Burgess <[hidden email]>
Sent: Tuesday, May 9, 2017 10:46 AM
To: [hidden email]
Subject: Re: Running processors simulaneously

Re: Running processors simultaneously

Vdsa
In reply to this post by Matt Burgess-2
Thanks, Matt, for the reply. This solution seems interesting. Right now, files are only generated on disk. Is it possible to generate a flow file for each file? If so, how can I achieve that?
Thanks in advance.

Re: Running processors simultaneously

Vdsa
I have implemented the solution of creating flow files, and it is working perfectly. The only problem I am facing is that the flow files are transferred only after the processor completes, or when the session is committed. Is there any way to send each flow file to the next processor as it is generated? If I commit the session after a flowfile.Transfer(), it throws an IllegalStateException.

How can I transfer the files to the next processor one at a time, or in batches?

Thanks.