Nifi ExecuteScript slow performance

balacode63
Dear All,

I've added an ExecuteScript processor that runs a Python script. It is a simple script that just adds one attribute to the flowfile. The flow is defined as follows:

   1. Listen on MQTT (ConsumeMQTT)
   2. Add an attribute to the flowfile (ExecuteScript)
   3. Write to a file, POST it, or apply some custom logic

But it's taking 3 seconds to process. Please guide me.

Thanks,
Bala
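
For reference, a script of the kind described in step 2 might look roughly like the sketch below. It is not the poster's actual script: the function name `add_attribute` and the attribute `source=mqtt` are hypothetical, and it assumes the `session` and `REL_SUCCESS` variables that ExecuteScript binds for the script.

```python
def add_attribute(session, rel_success, key="source", value="mqtt"):
    """Take one flowfile, add a single attribute, and route it onward.

    `key` and `value` are placeholder names chosen for illustration.
    """
    flow_file = session.get()
    if flow_file is None:
        # Nothing queued for this run, so there is nothing to do
        return None
    # putAttribute returns the updated flowfile reference
    flow_file = session.putAttribute(flow_file, key, value)
    session.transfer(flow_file, rel_success)
    return flow_file


# Inside ExecuteScript the processor provides the bindings, so the script
# body would end with a call such as:
#     add_attribute(session, REL_SUCCESS)
```

Wrapping the logic in a function that receives the session makes it possible to time or test it outside NiFi with a stand-in session object.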

Re: Nifi ExecuteScript slow performance

Joe Witt
Bala,

Are you saying that step 2 (executing the script) is taking three
seconds?  Is that per message?  Can you show the log or screenshot of
how you're tracking that?

Thanks
Joe

On Thu, Oct 27, 2016 at 6:02 AM, balacode63 <[hidden email]> wrote:

> Dear All,
>
> I've added an ExecuteScript processor that runs a Python script. It is a simple
> script that just adds one attribute to the flowfile. The flow is defined as follows:
>
>    1. Listen on MQTT (ConsumeMQTT)
>    2. Add an attribute to the flowfile (ExecuteScript)
>    3. Write to a file, POST it, or apply some custom logic
>
> But it's taking 3 seconds to process. Please guide me.
>
> Thanks,
> Bala
>
>
>
> --
> View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/Nifi-ExecuteScript-slow-performance-tp13735.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: Nifi ExecuteScript slow performance

Matt Burgess-2
Bala,

Jython in ExecuteScript is noticeably slower than other languages like
JavaScript and Groovy, but it shouldn't be that slow. Can you share
your script code? Also, is this just to investigate before writing a
more complex script in ExecuteScript, or do you just want to add an
attribute to a flowfile? If the latter, UpdateAttribute can do that,
but I suspect there is more to it than that (custom logic, e.g.).

Regards,
Matt

On Thu, Oct 27, 2016 at 8:08 AM, Joe Witt <[hidden email]> wrote:

> Bala,
>
> Are you saying that step 2 (executing the script) is taking three
> seconds?  Is that per message?  Can you show the log or screenshot of
> how you're tracking that?
>
> Thanks
> Joe

Re: Nifi ExecuteScript slow performance

balacode63
Thanks for the reply, Matt.
Jython is slow, so as you suggested we moved to Groovy. ExecuteScript is still taking more than 2 seconds to execute the high-level logic below.

The logic:
1) Split the data from the flowfile
2) Query InfluxDB with the split data (the InfluxDB jar file is used as an external dependency)
3) Write to InfluxDB
4) Write the flowfile attributes

Without the InfluxDB calls, the code is fast. Can you suggest how to improve the performance of the Groovy script below?

import org.apache.commons.io.IOUtils
import org.apache.nifi.processor.io.StreamCallback
import java.nio.charset.StandardCharsets
import java.util.concurrent.TimeUnit

import org.influxdb.InfluxDB
import org.influxdb.InfluxDBFactory
import org.influxdb.dto.BatchPoints
import org.influxdb.dto.Point
import org.influxdb.dto.Query
import org.influxdb.dto.QueryResult
import groovy.transform.Field

// Script-level fields, so the values parsed inside the callback are still
// available when the attributes are written afterwards
@Field String Field1 = ""
@Field String Field2 = ""
@Field String Field3 = ""
@Field String strTime = ""

def strData = ""

def flowFile = session.get()
if (!flowFile) return

flowFile = session.write(flowFile, { inputStream, outputStream ->
    // Split the flowfile content on commas, equals signs and whitespace
    def strSource = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
    String[] lstSource = strSource.split(/,|=|\s/)
    Field1 = lstSource[2]
    Field2 = lstSource[4]
    Field3 = lstSource[6]
    strTime = lstSource[7]

    // Note: a new InfluxDB connection is opened for every flowfile
    InfluxDB influxDB = InfluxDBFactory.connect("http://host", "root", "root")
    def strDbName = "dbname"
    if (Field3 == "1") {
        def strQuery = "SELECT last(endtime)  FROM table where endtime<=" + strTime
        Query query = new Query(strQuery, strDbName)
        QueryResult result = influxDB.query(query)

        // Keep the raw query result as field f1 and as the new content
        Field1 = result.toString()
        strData = result.toString()
    }

    // The batch is built and populated, but the write below sends point1
    // directly, so batchPoints itself is never written
    BatchPoints batchPoints = BatchPoints
            .database(strDbName)
            .retentionPolicy("default")
            .consistency(InfluxDB.ConsistencyLevel.ALL)
            .build()
    Point point1 = Point.measurement("myMeasurement")
            .time(System.currentTimeMillis(), TimeUnit.MILLISECONDS)
            .addField("f1", Field1)
            .addField("f2", Field2)
            .addField("f3", Field3)
            .build()
    batchPoints.point(point1)
    influxDB.write(strDbName, "default", point1)
    outputStream.write(strData.getBytes(StandardCharsets.UTF_8))
} as StreamCallback)

// Pass the parsed values rather than the literal strings "Field1", "Field2", "Field3"
flowFile = session.putAttribute(flowFile, "att_f1", Field1)
flowFile = session.putAttribute(flowFile, "att_f2", Field2)
flowFile = session.putAttribute(flowFile, "att_f3", Field3)
session.transfer(flowFile, REL_SUCCESS)

Thanks
Bala

Re: Nifi ExecuteScript slow performance

balacode63
In reply to this post by Joe Witt
Thanks for the reply, Joe.

As Matt suggested, we moved to Groovy.

Thanks
Bala

Re: Nifi ExecuteScript slow performance

Vyshali
In reply to this post by Matt Burgess-2
Hi Matt,

I'm using Jython in ExecuteScript because of my requirements. I can't switch to
Groovy because I'm using packages supported by Python. Is there any way to
increase the speed of the ExecuteScript processor? Please help me with your
ideas.

Thanks,
Vyshali



--
Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/

Re: Nifi ExecuteScript slow performance

Matt Burgess-2
Vyshali,

Jython itself is known to be relatively slow, so using the scripting
processors you'll always be up against that. I see a couple of options
to improve performance:

1) Use InvokeScriptedProcessor (ISP) instead of ExecuteScript. ISP is
faster because it only loads the script once, then invokes methods on
it, rather than ExecuteScript which evaluates the script each time.  I
have an ISP template in Jython [1] which should make porting your
ExecuteScript code easier.
2) Use ExecuteStreamCommand with command-line Python instead. You
won't have the flexibility of accessing attributes, processor state,
etc. but if you're just transforming content you should find
ExecuteStreamCommand with Python faster.
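
To illustrate the second option: by default ExecuteStreamCommand pipes the flowfile content to the command's standard input and takes its standard output as the new content, so the script can be ordinary command-line Python. The following is a minimal sketch only, assuming each line of content is a comma-separated record of key=value pairs. The function names, the appended `source=mqtt` pair, and the `--stdin` guard are all hypothetical choices made for this example.

```python
import sys


def add_field(line, key="source", value="mqtt"):
    """Append one key=value pair to a comma-separated record."""
    if not line:
        # Leave empty lines untouched
        return line
    return line + "," + key + "=" + value


def transform(text):
    """Apply add_field to every line of the incoming content."""
    return "\n".join(add_field(line) for line in text.splitlines())


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Flowfile content arrives on stdin; whatever is written to stdout
    # becomes the content of the outgoing flowfile
    stdout.write(transform(stdin.read()))


# Hypothetical guard: only read stdin when the script is started with
# --stdin, so importing or running the file without it has no side effects
if __name__ == "__main__" and "--stdin" in sys.argv[1:]:
    main()
```

Because the script runs in a regular Python interpreter rather than in Jython, it can use any package installed for that interpreter, at the cost of starting a new process for each flowfile.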

Regards,
Matt

[1] http://funnifi.blogspot.com/2017/11/invokescriptedprocessor-template.html


On Sun, Nov 12, 2017 at 7:31 PM, Vyshali <[hidden email]> wrote:

> Hi Matt,
>
> I'm using Jython in ExecuteScript because of my requirements. I can't switch to
> Groovy because I'm using packages supported by Python. Is there any way to
> increase the speed of the ExecuteScript processor? Please help me with your
> ideas.
>
> Thanks,
> Vyshali

Re: Nifi ExecuteScript slow performance

Vyshali
Thank you so much Matt.
I will try the solutions provided and come back in case of questions.

Thanks,
Vyshali




Re: Nifi ExecuteScript slow performance

Mike Thomsen
Vyshali,

Another trick that works very well if you have a lot of
flowfiles to process is to use session.get(int) to grab a batch. Obviously,
keep the volume you grab tuned to reasonable memory limits and all that,
but you can use that to make the script do a lot more work in one single
run. I have a flow that ends up processing millions of JSON files, and
grabbing 500 of them at once and merging them into a JSON array to create a
record batch really helps.
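
The merge step of such a batch can be sketched in plain Python. This is only an illustration, assuming the content of each flowfile has already been read into a string. `merge_json_batch` is a hypothetical helper, and the commented lines indicate roughly where the session calls would sit in a Jython script.

```python
import json


def merge_json_batch(contents):
    """Merge JSON documents (one string per flowfile) into one JSON array string."""
    # Parse each document so the result is a real array of records rather
    # than a concatenation of strings
    records = [json.loads(content) for content in contents]
    return json.dumps(records)


# Inside ExecuteScript the batch would come from the session, for example:
#     flowFiles = session.get(500)   # up to 500 flowfiles in one run
#     ... read the content of each flowfile into a string ...
#     merged = merge_json_batch(contents)
# The batch size of 500 is the figure mentioned above; tune it to the
# memory available.
```

Parsing every document costs some time, but it also catches malformed JSON before it ends up inside the merged record batch.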

On Mon, Nov 13, 2017 at 12:23 PM, Vyshali <[hidden email]> wrote:

> Thank you so much Matt.
> I will try the solutions provided and come back in case of questions.
>
> Thanks,
> Vyshali