Executing a python script with Execute Stream Command

Stephen Pietrasko
All,

I am trying to configure the Execute Stream Command processor to execute a
python script and have the output sent to a queue with PutJMS.

I'm having a bit of difficulty though. I've been looking at this previous
email chain which is similar to my issue.
https://www.mail-archive.com/dev@.../msg01578.html

The script runs and writes its output with sys.stdout.write, but when I try
to have NiFi run the script I see no bytes in or out, which means nothing is
passed to the queue.

Would this be an issue with the output being sent to stdout or a property
issue with ExecuteStreamCommand?

I have tried several configurations of the property fields. This is my
general understanding of each field and what they should be:

Command Argument: name of script and arguments
Command Path: python
Working Directory: Directory where script is located.

Any help would be greatly appreciated.

--
V/R

Stephen M. Pietrasko
Security Engineer

RE: Executing a python script with Execute Stream Command

Mark Payne
Stephen,

Your understanding of the properties seems correct. Can you provide the properties that you're using to configure the processor?

Thanks
-mark

----------------------------------------

> Date: Thu, 4 Jun 2015 09:51:46 -0400
> Subject: Executing a python script with Execute Stream Command
> From: [hidden email]
> To: [hidden email]; [hidden email]

Re: Executing a python script with Execute Stream Command

Stephen Pietrasko
Mark,

The properties I am using are as follows:

Command Argument: nameofscript.py -j multiline
Command Path: python
Working Directory: /opt/dev/


Jimmy,

Not exactly sure what you are asking with your question "Does the python
script that you run from NiFi have a select set of Python packages you can
leverage in your python script.  Is it at all possible to add additional
python packages?"

Here is a sanitized version of the script. Are you asking if I can import
more packages in my script? If so, yes, I can do that.

http://pastebin.com/peSCkx6j


Thank you guys.

-Steve


On Thu, Jun 4, 2015 at 9:57 AM, Mark Payne <[hidden email]> wrote:


RE: Executing a python script with Execute Stream Command

Mark Payne
Stephen,

The "Command Argument" property expects the arguments to be delimited by semi-colons, rather than spaces. 

Try changing that property to "nameofscript.py;-j;multiline" and see if that works for you.

Thanks
-Mark
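
As a rough illustration of the semicolon rule above, the following Python sketch shows how a semicolon-delimited argument string could be turned into an argument list. The helper name `build_argv` is hypothetical, and the splitting logic is an assumption about the behavior described here, not NiFi's actual implementation.

```python
def build_argv(command_path, command_arguments, delimiter=";"):
    """Combine a command path with a delimited argument string.

    Hypothetical helper: models the behavior described above, where
    arguments are separated by semicolons rather than spaces.
    """
    # Drop empty pieces so a trailing delimiter does not add a blank argument.
    args = [part for part in command_arguments.split(delimiter) if part]
    return [command_path] + args


# Space-separated input stays one single argument, which python would
# then try to open as a script file name.
print(build_argv("python", "nameofscript.py -j multiline"))

# Semicolon-separated input yields one argument per token.
print(build_argv("python", "nameofscript.py;-j;multiline"))
```

Under this model, the space-separated form hands python one long argument instead of a script name followed by its options.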

----------------------------------------

> Date: Thu, 4 Jun 2015 12:34:26 -0400
> Subject: Re: Executing a python script with Execute Stream Command
> From: [hidden email]
> To: [hidden email]

Re: Executing a python script with Execute Stream Command

Stephen Pietrasko
Mark,

Unfortunately that did not work. The Tasks/Time keeps increasing but nothing
else happens.

Thanks,
Steve

On Thu, Jun 4, 2015 at 12:37 PM, Mark Payne <[hidden email]> wrote:


Re: Executing a python script with Execute Stream Command

Aldrin Piri
Steve,

I was able to mock up a flow myself and can provide a template to share
with you that acts as I would anticipate.  All of this is coming with the
heavy caveat that I am not a Python master by any means.

Before that, however, reading through the history, can you clarify if you
are providing any input to the processor?  Based on the context and noted
behavior of the tasks/time increasing, my suspicion is that you are not,
and the intent of the processor is not aligning with your expectations of
this processor acting as a means of ingest into the flow.  To that end, the
intent of the ExecuteStreamCommand processor as designed is to "... execute[s] an
external command on the contents of a flow file, and create[s] a new flow
file with the results of the command."  Accordingly, if there is no input,
the processor just returns after being allotted an execution cycle.

I believe you may be after the ExecuteProcess processor which could be
adapted to carry out execution without the need for input.

Let us know if that is the case; if not, any additional clues will help us
get to the issue.

Thanks!

--aldrin
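
To make the distinction concrete, here is a minimal sketch of a script written in the stream-in, stream-out style that the quoted description suggests. It assumes the flow file content arrives on standard input and that whatever is written to standard output becomes the content of the new flow file. The function name `transform` and the upper-casing step are placeholders, and in-memory streams stand in for the real ones.

```python
import io


def transform(in_stream, out_stream):
    """Copy in_stream to out_stream, upper-casing each line (placeholder logic).

    Returns the number of lines processed.
    """
    count = 0
    for line in in_stream:
        out_stream.write(line.upper())
        count += 1
    # Flush so buffered output is not lost when the process exits.
    out_stream.flush()
    return count


# Simulate a flow file's content with an in-memory stream.
incoming = io.StringIO("alpha\nbeta\n")
outgoing = io.StringIO()
transform(incoming, outgoing)
print(outgoing.getvalue(), end="")

# With empty input the loop never runs and nothing is written.
empty_out = io.StringIO()
print(transform(io.StringIO(""), empty_out), repr(empty_out.getvalue()))
```

In a real script the two streams would be `sys.stdin` and `sys.stdout`. Note that, as described above, the processor simply returns when there is no incoming flow file, so such a script would not even be started.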

On Thu, Jun 4, 2015 at 12:51 PM, Stephen Pietrasko <
[hidden email]> wrote:


Re: Executing a python script with Execute Stream Command

Stephen Pietrasko
Aldrin,

I want to thank you for your help. ExecuteProcess was the solution to my
problem.

Thanks to everyone who helped.

-Steve

On Thu, Jun 4, 2015 at 2:06 PM, Aldrin Piri <[hidden email]> wrote:




--
V/R

Stephen M. Pietrasko
Security Engineer
G2-Inc
301-575-5142
www.g2-inc.com

RE: Executing a python script with Execute Stream Command

Mark Payne
Steve,

That's great that you guys have gotten this resolved.

Aldrin,

Great call & great work getting that stuff settled.

All,

I think this is a very important usability problem - I'm sure there are plenty of other people who will run
into similar issues. I think we need to add something to the API that allows the developer of a Processor
to indicate that the Processor fits into 1 of 3 categories:

A) Does not expect incoming FlowFiles (UI should not allow you to even create a connection to the Processor; 
if one exists already, the processor should become invalid)

B) Processor does expect incoming FlowFiles (Processor should become invalid until it has an incoming connection,
just like it does if its outgoing connections are not all satisfied)

C) Processor can take incoming FlowFiles but doesn't require them. I don't know that we have this
use case in any of our Processors, but it is a valid use case, I think. In this case, the API needs to provide
information to the Processor (via the ProcessContext) about whether or not it has any incoming connections.
I believe I may have already created a ticket for this, but not sure.

Does anybody have any thoughts on this?
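
The three categories can be modeled in a few lines. The sketch below is only an illustration of the validation rules proposed above, written in Python for brevity. The names `InputRequirement` and `is_valid` are made up for this example and are not part of any existing API.

```python
from enum import Enum


class InputRequirement(Enum):
    """Hypothetical names for the three categories (A, B, C) above."""
    FORBIDDEN = "A"  # does not expect incoming FlowFiles
    REQUIRED = "B"   # expects incoming FlowFiles
    ALLOWED = "C"    # can take incoming FlowFiles but does not need them


def is_valid(requirement, incoming_connections):
    """Return True if a processor with this requirement is validly wired."""
    has_input = incoming_connections > 0
    if requirement is InputRequirement.FORBIDDEN:
        return not has_input
    if requirement is InputRequirement.REQUIRED:
        return has_input
    # ALLOWED: valid either way; the processor would still need to be told
    # whether any incoming connection exists.
    return True


# Show validity with zero and with one incoming connection.
for requirement in InputRequirement:
    print(requirement.name, is_valid(requirement, 0), is_valid(requirement, 1))
```

Under these rules, the configuration discussed in this thread (a category B processor with no incoming connection) would have been flagged as invalid instead of silently doing nothing.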


----------------------------------------

> Date: Fri, 5 Jun 2015 08:04:55 -0400
> Subject: Re: Executing a python script with Execute Stream Command
> From: [hidden email]
> To: [hidden email]

Re: Executing a python script with Execute Stream Command

Joe Witt
This is in the category of 'what sorts of things can and should we do
to provide type safety of flow configuration all the way to the user
at runtime'.

At the NYC meetup someone asked a similar question which was 'will the
flow let me configure processors to talk to each other that don't make
sense'.  The cases you outline above seem quite doable with an
annotation.  We could extend this concept to include consideration of
attributes on flowfiles that a given processor requires.  I think it
would be limited in terms of being able to prevent the user from
making a flow that doesn't make sense because we don't know the
attributes of a flowfile until they're flowing.  However, we could
create automatic dead-letter-queue behavior if a flowfile ends up
on a connection whose consuming processor does not accept it.
@RequiresAttribute('mime.type','application/json'), for instance, is
something a processor could declare, and if it receives a flowfile
that doesn't have that attribute name and value, the framework would
ensure it didn't get picked up by that processor.

...there is a lot we can do here.
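
A small sketch of the dead-letter idea, in Python for brevity: flowfiles are modeled as plain dictionaries of attributes, and anything lacking the required attribute value is set aside instead of being handed to the processor. The function name `route` and the sample data are invented for illustration, and this is not how any existing framework behaves.

```python
def route(flowfiles, required_name, required_value):
    """Split flowfiles into (accepted, dead_letter) by one required attribute.

    Each flowfile is modeled as a plain dict of attribute names to values.
    """
    accepted, dead_letter = [], []
    for flowfile in flowfiles:
        if flowfile.get(required_name) == required_value:
            accepted.append(flowfile)
        else:
            dead_letter.append(flowfile)
    return accepted, dead_letter


flowfiles = [
    {"filename": "a.json", "mime.type": "application/json"},
    {"filename": "b.xml", "mime.type": "application/xml"},
    {"filename": "c.bin"},  # attribute missing entirely
]
accepted, dead_letter = route(flowfiles, "mime.type", "application/json")
print([f["filename"] for f in accepted])
print([f["filename"] for f in dead_letter])
```

The check can only happen at runtime, which matches the limitation noted above: attributes are not known until flowfiles are actually flowing.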

On Fri, Jun 5, 2015 at 8:22 AM, Mark Payne <[hidden email]> wrote:
