How to iterate through complex JSON objects.

shweta
Hi All,

I have a JSON document that looks like this:

item[{
    "tags": ["Java","Hadoop","nimbus"]
    "id": "2233"
    "title": "testing with Java"
    "comments":[{"post_id":"2233","body":"try option1"} , {"post_id":"2233","body":"try option2"} , {"post_id":"2233","body":"try option3"}]},

{
    "tags": ["Java","Hadoop"]
    "id": "2232"
    "title": "testing with Java"
    "comments":[{"post_id":"2232","body":"try option1"} , {"post_id":"2232","body":"try option2"} , {"post_id":"2232","body":"try option3"},

{
    "tags": ["Java"]
    "id": "2231"
    "title": "testing with Java"
    "comments":[{"post_id":"2231","body":"try option1"} , {"post_id":"2231","body":"try option2"} , {"post_id":"2231","body":"try option3"}
]

I need to convert the JSON to CSV.

Id   , Tags                   , Title               , Body
2233 , <java><Hadoop><Nimbus> , "testing with Java" , <"try option1"><"try option2"><"try option3">
2232 , <java><Hadoop>         , "testing with Java" , <"try option1"><"try option2"><"try option3">
.
.
I used a combination of EvaluateJsonPath and ReplaceText for this. The first issue I'm facing is in parsing the arrays (Tags and Body): I couldn't figure out how to iterate over an array of JSON values.

For simplicity's sake, I configured EvaluateJsonPath with the values shown in the image:

EvaluateJSONPath_properties <http://apache-nifi-developer-list.39713.n7.nabble.com/file/n5776/EvaluateJsonPath.png>

and a ReplaceText processor with the regular expression [\S\s]+ and the replacement value "${Body}","${Tags}","${Post_id}","${Title}".

This works for a single record.

Can you please give me some pointers on how to iterate through the values of the arrays?

Thanks,
Shweta





Re: How to iterate through complex JSON objects.

Joe Percivall
Hello Shweta,

Where did you get that JSON? I ask because it's not valid. I put it in a JSONPath evaluator[1] and cleaned it up:


{
  "item": [
    {
      "tags": ["Java", "Hadoop", "nimbus"],
      "id": "2233",
      "title": "testing with Java",
      "comments": [
        {"post_id": "2233", "body": "try option1"},
        {"post_id": "2233", "body": "try option2"},
        {"post_id": "2233", "body": "try option3"}
      ]
    },
    {
      "tags": ["Java", "Hadoop"],
      "id": "2232",
      "title": "testing with Java",
      "comments": [
        {"post_id": "2232", "body": "try option1"},
        {"post_id": "2232", "body": "try option2"},
        {"post_id": "2232", "body": "try option3"}
      ]
    },
    {
      "tags": ["Java"],
      "id": "2231",
      "title": "testing with Java",
      "comments": [
        {"post_id": "2231", "body": "try option1"},
        {"post_id": "2231", "body": "try option2"},
        {"post_id": "2231", "body": "try option3"}
      ]
    }
  ]
}

You were missing the proper beginning and some commas.
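
A quick way to catch this kind of problem before the data reaches a flow is to run the text through a strict JSON parser. The following is only a minimal Python sketch, run outside NiFi; the `check_json` helper and the shortened sample strings are made up for illustration and are not part of the original post.

```python
import json


def check_json(text):
    """Return None if text parses as JSON, else a short error description."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"


# Shortened stand-in for the original post: no enclosing object, no commas.
broken = 'item[{"tags": ["Java"] "id": "2231"}]'

# Same data with the enclosing object and the commas restored.
fixed = '{"item": [{"tags": ["Java"], "id": "2231"}]}'

print(check_json(broken))  # prints an error description
print(check_json(fixed))   # prints None
```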

To "iterate" over the items, use the "*" (wildcard) path operator, like this:

$.item.*.id

This path returns the id of each item:

'0' => "2233"
'1' => "2232"
'2' => "2231"
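
For readers who think more easily in code, the wildcard path does roughly what a loop over the "item" array does. This is a plain-Python sketch of the same selection, not a JSONPath implementation; the trimmed-down `doc` is an assumption made to keep the example short.

```python
import json

# Trimmed copy of the cleaned-up document (tags and comments omitted).
doc = json.loads("""
{"item": [
  {"id": "2233", "title": "testing with Java"},
  {"id": "2232", "title": "testing with Java"},
  {"id": "2231", "title": "testing with Java"}
]}
""")

# Rough equivalent of $.item.*.id: visit every element of "item", take its "id".
ids = [entry["id"] for entry in doc["item"]]
print(ids)  # ['2233', '2232', '2231']
```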


[1] http://jsonpath.com/
Let me know if you have any other issues,
Joe
 - - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: [hidden email]




In reply to shweta's message of Monday, December 14, 2015, 12:57 PM.

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/How-to-iterate-through-complex-JSON-objects-tp5776.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: How to iterate through complex JSON objects.

shweta
Hi Joe,

Thanks for the quick response. My bad, I did not verify the JSON; it was a typo.
I could get the array of values I wanted.
But the problem now is that, since the value being returned is not a scalar, I'm not able to store it in a variable. EvaluateJsonPath throws an exception saying it is unable to return a scalar value. The evaluated value shown in the exception, however, is correct.
How can I store a non-scalar value as a flow file attribute?

Thanks,
Shweta

Re: How to iterate through complex JSON objects.

shweta
I just figured out that by setting the Return Type to JSON in the "EvaluateJsonPath" processor, I get the entire array of values. So for the JSON path expressions "$.item.*.id" and "$.item.*.title", I got ["2233","2232","2231"] and ["testing with Java","testing with Java","testing with Java"].
I'm now trying to figure out how I can transpose these and instead get something like this:

2233, "testing with Java"
2232, "testing with Java"
2231, "testing with Java"

to generate my desired csv.
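
Outside NiFi, the transposition itself is a pairing of the n-th element of each array. Here is a small Python sketch of that idea, assuming the two arrays have already been extracted; it only lines up correctly while both arrays have exactly one entry per item, in the same order.

```python
ids = ["2233", "2232", "2231"]
titles = ["testing with Java", "testing with Java", "testing with Java"]

# zip pairs the n-th id with the n-th title, turning two columns into rows.
lines = [f'{item_id}, "{title}"' for item_id, title in zip(ids, titles)]

csv_text = "\n".join(lines)
print(csv_text)
# 2233, "testing with Java"
# 2232, "testing with Java"
# 2231, "testing with Java"
```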

Re: How to iterate through complex JSON objects.

Bryan Bende
As an alternative approach, could you use SplitJson first to split on the
item array?

You would get a FlowFile for each item, so when you use EvaluateJsonPath
you would be dealing with only a single item per FlowFile and could extract
the id and title and use ReplaceText like you were already doing.

Then use MergeContent at the end to merge them back together, or, depending
on what you are doing, maybe they don't need to be merged.
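
The shape of that flow can be mimicked in a few lines of Python. This is only an illustration of the split / extract / replace / merge idea, with each step as a hypothetical function; none of these names are NiFi APIs, and the real processors are configured in the UI rather than called like this.

```python
import json

flowfile_content = json.dumps({"item": [
    {"id": "2233", "title": "testing with Java"},
    {"id": "2232", "title": "testing with Java"},
    {"id": "2231", "title": "testing with Java"},
]})


def split_items(content):
    """Split step: one JSON string per element of the "item" array."""
    return [json.dumps(entry) for entry in json.loads(content)["item"]]


def extract_attributes(item_content):
    """Extract step: pull scalar fields out of a single item."""
    entry = json.loads(item_content)
    return {"Post_id": entry["id"], "Title": entry["title"]}


def to_csv_line(attributes):
    """Replace step: rewrite the content as one CSV line built from attributes."""
    return f'{attributes["Post_id"]}, "{attributes["Title"]}"'


def merge(lines):
    """Merge step: join the per-item lines back into one document."""
    return "\n".join(lines)


merged = merge(
    to_csv_line(extract_attributes(part)) for part in split_items(flowfile_content)
)
print(merged)
```

Because each split piece holds a single item, every extracted value is a scalar, which is what makes the attribute-based replacement work per record.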

In reply to shweta's message of Tue, Dec 15, 2015, 3:27 AM.

Re: How to iterate through complex JSON objects.

shweta
Thanks Bryan!! In fact, I followed exactly the approach you described. I just had no idea how to use the MergeContent processor, so I wrote a custom script to combine the different outputs and ran it using ExecuteStreamCommand.
I will try the same with MergeContent.

Re: How to iterate through complex JSON objects.

mring33621
Shweta,

I think your issue demonstrates one of my minor complaints with NiFi --
that you always have to think in terms of several little, built-in pieces
to get a simple job done. Sometimes it's fun, like a puzzle, but other
times, I don't feel like dealing with it. That's why I wrote this:
https://github.com/mring33621/scripting-for-nifi. A short, custom JS or
Groovy script could have handled your JSON data munging in a single stroke.

-Matt
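
To give a feel for what such a single-script conversion might look like, here is a standalone sketch. It is written in Python rather than JS or Groovy, it does not use the scripting-for-nifi project or any NiFi API, and the helper names `angle_join` and `json_to_csv` are invented for this example. It assumes the cleaned-up document structure and the CSV layout requested in the first post.

```python
import json


def angle_join(values, quote=False):
    """Render a list as <a><b><c>, optionally wrapping each value in quotes."""
    return "".join(f'<"{v}">' if quote else f"<{v}>" for v in values)


def json_to_csv(text):
    """Convert the cleaned-up document into the CSV layout from the first post."""
    lines = []
    for entry in json.loads(text)["item"]:
        bodies = [comment["body"] for comment in entry["comments"]]
        lines.append(", ".join([
            entry["id"],
            angle_join(entry["tags"]),
            f'"{entry["title"]}"',
            angle_join(bodies, quote=True),
        ]))
    return "\n".join(lines)


# One-item sample in the same shape as the cleaned-up JSON.
sample = json.dumps({"item": [{
    "tags": ["Java", "Hadoop"],
    "id": "2232",
    "title": "testing with Java",
    "comments": [
        {"post_id": "2232", "body": "try option1"},
        {"post_id": "2232", "body": "try option2"},
    ],
}]})

print(json_to_csv(sample))
# 2232, <Java><Hadoop>, "testing with Java", <"try option1"><"try option2">
```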

In reply to shweta's message of Tue, Dec 15, 2015, 8:40 PM.

Re: How to iterate through complex JSON objects.

Joe Witt
It is a fair criticism that sometimes the cohesion level of processors
can be simply too much.  Early on I used to 'fight' to find the right
abstraction and argue that others do the same.  But what I've found is
that it is better to let it happen naturally and to offer options.
Matt, I think your approach of giving yourself an option to break into
scripting in the middle of the flow, in a way that lets you mangle data
as needed while still benefiting from the strength of the framework, is
perfect.  Matt Burgess is working on NIFI-210 to incorporate those
languages and many others.

Thanks
Joe

In reply to Angry Duck Studio's message of Wed, Dec 16, 2015, 8:27 AM.

Re: How to iterate through complex JSON objects.

Matt Burgess
All,

I have submitted a patch for NIFI-210 to offer scripting capabilities. My GitHub feature branch is at:

https://github.com/mattyb149/nifi/tree/script-processors


I would truly appreciate any comments, questions, or suggestions about this capability.

Regards,
Matt




In reply to Joe Witt's message of 12/16/15, 11:41 AM.

Re: How to iterate through complex JSON objects.

Igor K.
In reply to this post by mring33621
The question is: is NiFi supposed to be a full ETL tool?

Re: How to iterate through complex JSON objects.

Joe Witt
Igor,

The term ETL has a lot of baggage associated with it.  What NiFi was
built to do is dataflow management.  There are already a lot of tools
out there that address the typical relational database ETL space and
NiFi doesn't need to replicate all of those functions.  So probably
best to just focus on use cases/problems and see if NiFi handles them
nicely now, doesn't handle them nicely now but should be made to do
so, or doesn't handle them nicely and it should always be left to some
other system.

Thanks
Joe

In reply to Igor Kravzov's message of Wed, Dec 16, 2015, 7:10 PM.