Provenance graph improvement

Philip Young
Hi,

I know that the provenance graph view has gone through a couple of different iterations, but when using it I always get the feeling that it is the poor cousin of the main interface.

What I would like to see, as mentioned in another discussion, is some awesome sauce applied to this area. It would be more useful to view the exact path that a flowfile has travelled on the main interface. The concept of the timeline from the existing provenance view could be integrated into the main interface when in provenance view mode. When the timeline is advanced, the processor and connectors that the flowfile is currently in could be highlighted (the view could also be augmented with additional data about the type of provenance event). If there is cloning, then the cloned flowfile could also be shown in the animation.

I believe that this approach of reusing the main interface for provenance makes more sense, as users are already familiar with their graph and it would be more intuitive to view provenance on the graph that the flowfile actually travelled.

Hope that makes sense. I would be interested in others' thoughts on this.

Cheers
Phil Young

Re: Provenance graph improvement

Matt Gilman
Phil,

Thanks for the feedback. I agree that time spent iterating on the
Provenance UI would not be wasteful. Pulling in elements from the main
canvas seems like a reasonable place to start.

However, that graph is all about the data lineage, not the data flow. The
events needed to tell the entire story often come from the same processor.
What I mean here is that a processor is able to emit as many provenance
events as necessary to describe what it actually did to the data.
Additionally, some events do not originate from processors. Some actions,
like REPLAYing or DOWNLOADing, come from actions users take in the
Provenance UI. Simply showing the components from the main canvas may lose
important granularity.

Another important note here is that the lineage is a view of time. This
means that the graph starts with the oldest events and continues throughout
the lineage of that data in a linear fashion. This makes it really easy to
comprehend what happened to the data and in what order. Showing this
timeline using the data flow graph can quickly become confusing when the
route the data took starts looping, forking, joining, etc.
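
To make the distinction concrete, here is a minimal sketch of the two views
over the same data. It is not NiFi code: the ProvenanceEvent class, the
component names (Ingest, Tag, Compress), the timestamps, and the event type
names other than REPLAY are all made up for illustration. It only shows that
ordering events by time keeps every event, while collapsing them onto the
components of the flow drops repeated and user-triggered events.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ProvenanceEvent:
    """Hypothetical stand-in for a provenance event (not the NiFi class)."""
    timestamp: int            # assumed: milliseconds since the data arrived
    event_type: str           # illustrative names; only REPLAY is from the thread
    component: Optional[str]  # None when a user triggered the event from the UI


# Made-up events, deliberately out of order.
events = [
    ProvenanceEvent(30, "CONTENT_MODIFIED", "Compress"),
    ProvenanceEvent(0, "RECEIVE", "Ingest"),
    ProvenanceEvent(40, "REPLAY", None),
    ProvenanceEvent(10, "ATTRIBUTES_MODIFIED", "Tag"),
    ProvenanceEvent(20, "ATTRIBUTES_MODIFIED", "Tag"),
    ProvenanceEvent(50, "CONTENT_MODIFIED", "Compress"),
]


def lineage(evts: List[ProvenanceEvent]) -> List[ProvenanceEvent]:
    """Lineage view: every event, oldest first."""
    return sorted(evts, key=lambda e: e.timestamp)


def components_visited(evts: List[ProvenanceEvent]) -> List[str]:
    """Flow view: each component once, in order of first appearance."""
    seen: List[str] = []
    for e in lineage(evts):
        if e.component is not None and e.component not in seen:
            seen.append(e.component)
    return seen


lineage_view: List[Tuple[int, str]] = [
    (e.timestamp, e.event_type) for e in lineage(events)
]
canvas_view = components_visited(events)

print(len(lineage_view), "events in the lineage view")
print(len(canvas_view), "components in the flow view:", canvas_view)
```

In this sketch the lineage view keeps all six events, including the two from
the same component and the user-triggered REPLAY, while the flow view is left
with three components and no record of how many events each one emitted.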

If I've not understood your suggestion completely, please let me know. Also
looking forward to other thoughts...

Thanks!

Matt



On Thu, Jan 15, 2015 at 11:54 PM, Philip Young <[hidden email]> wrote:

RE: Provenance graph improvement

johny casanova
It would be nice if we could use graphs like the D3 ones: https://github.com/mbostock/d3/wiki/Gallery

> Date: Fri, 16 Jan 2015 07:43:26 -0500
> Subject: Re: Provenance graph improvement
> From: [hidden email]
> To: [hidden email]

Re: Provenance graph improvement

Matt Gilman
Johny,

What were you thinking? The main canvas, stats history, and provenance
lineage are already implemented using D3. So incorporating a particular
technique or effect wouldn't be difficult if it made sense. Thanks.

Matt

On Fri, Jan 16, 2015 at 7:48 AM, johny casanova <[hidden email]> wrote:

RE: Provenance graph improvement

johny casanova
I apologize, that's what I meant: use more of the graphs. Can provenance show geotags? Maybe some of the maps would be helpful. For example, when tracking data, you could use the map to say "the data went to the server in this state, to that state, then to this server." Does that make sense?

> Date: Fri, 16 Jan 2015 07:54:06 -0500
> Subject: Re: Provenance graph improvement
> From: [hidden email]
> To: [hidden email]

Re: Provenance graph improvement

Matt Gilman
By the way, I just realized I wrote 'What were you thinking?' and that came
across wrong. Sorry. The intent was more along the lines of 'What did you
have in mind?' We'll look through some more of those D3 examples to see if
any make sense.

Thanks.

Matt

On Fri, Jan 16, 2015 at 8:15 AM, johny casanova <[hidden email]> wrote:
