ExecuteSQL generated Avro schema

Nabegh

I'm trying to pull records from a database, split them, encode each record separately, send the encoded record over the network, and finally decode the record at the receiving end.

The processors I'm using are ExecuteSQL => ConvertAvroToJSON => SplitText => ExecuteScript => InvokeHTTP

ExecuteScript encodes JSON records using Avro APIs.

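I'm facing a problem because of the way ExecuteSQL generates the Avro schema. ExecuteSQL creates a union type for every SQL type, for example ["null", "string"]. See here: <https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/util/JdbcCommon.java#L177>

I believe it was designed this way to support schema evolution. However, Avro will not parse JSON records using this schema. See here: <http://mail-archives.apache.org/mod_mbox/avro-user/201412.mbox/%3CCALEq1Z-sKNT-fBpMhAa%3DGTjLq5wuKf5mAuvLYos4Ba17hUi%2Bfw%40mail.gmail.com%3E>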

Appreciate your input.



Re: ExecuteSQL generated Avro schema

Bryan Bende
Hello,

I know this is not a direct answer to your question, but if you want to
send Avro records at the end, is there a reason you couldn't do ExecuteSQL
-> SplitAvro -> InvokeHTTP?

I'm assuming there is more logic in ExecuteScript besides just converting
JSON to Avro, but wanted to make sure you really needed to go from Avro to
JSON and back to Avro.

Thanks,

Bryan

On Tue, Aug 9, 2016 at 2:07 PM, Nabegh <[hidden email]> wrote:

>
> I'm trying to pull records from a database, split them, encode each record
> separately, send the encoded record over the network, and finally decode
> the record at the receiving end.
>
> The processors I'm using are ExecuteSQL => ConvertAvroToJSON => SplitText
> => ExecuteScript => InvokeHTTP
>
> ExecuteScript encodes JSON records using Avro APIs.
>
> I'm facing a problem because of the way ExecuteSQL generates the Avro
> schema. ExecuteSQL creates a union type for all SQL types like ["null",
> "string"], for example. See here
> <https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/util/JdbcCommon.java#L177>
>
> I believe it was designed this way to support schema evolution. However,
> Avro on the other hand will not parse JSON records using this schema. See
> here
> <http://mail-archives.apache.org/mod_mbox/avro-user/201412.mbox/%3CCALEq1Z-sKNT-fBpMhAa%3DGTjLq5wuKf5mAuvLYos4Ba17hUi%2Bfw%40mail.gmail.com%3E>
>
> Appreciate your input.

Re: ExecuteSQL generated Avro schema

Nabegh
Hi Bryan,

SplitAvro will include the schema with the data. I'm trying to separate them to reduce the message size that is being sent through the network.

Thanks

Re: ExecuteSQL generated Avro schema

Bryan Bende
SplitAvro should have a property called Output Strategy that can be changed
to "Bare Record", which should emit each record without the schema.

Would that work?
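
To illustrate why a bare record is so much smaller than a record shipped with its schema, here is a hand-rolled sketch of Avro's binary encoding for a record whose fields are all ["null", "string"] unions. It assumes the receiver already knows the schema and the field order. The function names are hypothetical, and this is not what SplitAvro or the Avro library runs internally, only the byte layout the Avro specification describes.

```python
def encode_long(n: int) -> bytes:
    """Encode an Avro int/long: zigzag, then base-128 varint, low group first."""
    z = (n << 1) ^ (n >> 63)  # zigzag maps small negatives to small positives
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # set the continuation bit
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode an Avro string: byte length as a long, then the UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_nullable_string(value) -> bytes:
    """Encode a ["null", "string"] union: branch index, then the branch value."""
    if value is None:
        return encode_long(0)  # branch 0 is null, which has no bytes of its own
    return encode_long(1) + encode_string(value)


def encode_bare_record(record: dict, field_names: list) -> bytes:
    """Concatenate the field encodings in schema order; no names, no schema."""
    return b"".join(encode_nullable_string(record.get(name)) for name in field_names)


# Hypothetical two-column row; the second column is NULL.
payload = encode_bare_record({"name": "foo", "city": None}, ["name", "city"])
print(payload)  # prints b'\x02\x06foo\x00'
```

The whole row takes 6 bytes here because field names and types live only in the schema, which the receiving end must obtain some other way.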

On Tue, Aug 9, 2016 at 2:22 PM, Nabegh <[hidden email]> wrote:

> Hi Bryan,
>
> SplitAvro will include the schema with the data. I'm trying to separate
> them to reduce the message size that is being sent through the network.
>
> Thanks

Re: ExecuteSQL generated Avro schema

Nabegh
Yes. Thanks for the tip!