Description
John Joyce (Acryl Data) gives an update on recent improvements to fine-grained access control during the DataHub Community Town Hall on August 27, 2021.
A
Yeah, so I'm going to do a quick overview of where we are in fine-grained access control. This is something we started thinking about at the beginning of the summer, based on a lot of feedback from the community around wanting this capability to control who has access to what metadata on DataHub's platform.
A
One is an actor, which determines the "who" portion of the policy; a privilege, so what action they can perform; and then, finally, a resource, or an object. This is commonly known as actor-verb-object in some areas of the world; we use actor, privilege, resource. What we do is put these three things together and call them a policy. So with the new implementation, we allow you to declare a policy that includes these three things to control access on DataHub.
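As a rough illustration of the actor, privilege, resource triple described here, a policy could be modeled as a small record. This is a minimal sketch in Python; the class name, field names, and the example identifiers are assumptions made for illustration and are not taken from DataHub's actual policy schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record: who may do what to which resource."""
    actor: str       # the "who", e.g. a user or group name
    privilege: str   # the action the actor may perform
    resource: str    # the object the action applies to


# Illustrative values only; these identifiers are made up for the sketch.
policy = Policy(actor="john", privilege="EDIT_TAGS", resource="dataset:sample")
print(policy)
```

Keeping the three parts as separate fields mirrors the way the talk describes a policy as "these three things" put together.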
A
So I'm just going to talk about a few policies in English. On DataHub's platform, you may want to restrict who can do certain things. So, number one: maybe dataset owners should be able to add documentation, but they shouldn't be able to add tags, so we want a controlled vocabulary of tags, perhaps. Another example: maybe the data platform team should be able to edit anything about a dataset, right, because they manage the platform; they're sort of the admins of DataHub. Maybe Ted,
A
our data steward, should be able to edit any dataset's tags, maybe that's his job, but shouldn't be able to edit the description or the ownership or anything else. And finally, maybe the administrative group should be able to manage policies themselves, right, so they should be able to dictate who can do what on the platform.
A
We wanted to apply these policies to resources at two levels. One is based on the resource type, so imagine dataset assets, or dashboards, or charts. The other is the resource identity level, so being able to call out a particular dataset or a particular chart and apply fine-grained access control against that asset individually. And then, finally, we wanted to model this concept of actors using our concepts of users and groups that already exist.
A
So we wanted to be able to say that John should be able to do something to a particular dataset, or maybe a group should be able to do something to a particular dashboard. We also wanted the ability to support this sort of wildcard predicate and say all users or all groups should be able to do something to a particular asset.
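The two resource levels (type versus individual identity) and the wildcard actor predicate can be sketched as two small matching functions. This is only an illustration under stated assumptions: the `ALL` marker, the field names, and the identifier strings are hypothetical, and "all groups" is interpreted here as "any user who belongs to at least one group".

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

ALL = "*"  # hypothetical wildcard marker meaning "all users" or "all groups"


@dataclass(frozen=True)
class Policy:
    privilege: str
    resource_type: str                      # e.g. "dataset", "dashboard", "chart"
    resource_ids: Optional[FrozenSet[str]]  # None applies to every resource of the type
    users: FrozenSet[str] = frozenset()
    groups: FrozenSet[str] = frozenset()


def matches_resource(policy: Policy, resource_type: str, resource_id: str) -> bool:
    """Type-level match first, then an optional identity-level match."""
    if policy.resource_type != resource_type:
        return False
    return policy.resource_ids is None or resource_id in policy.resource_ids


def matches_actor(policy: Policy, user: str, user_groups: FrozenSet[str]) -> bool:
    """Match a named user, a wildcard, or any shared group."""
    if ALL in policy.users or user in policy.users:
        return True
    if ALL in policy.groups and user_groups:
        return True
    return bool(policy.groups & user_groups)


# A type-level policy for one user, and an identity-level policy for all users.
by_type = Policy("EDIT_TAGS", "dataset", None, users=frozenset({"john"}))
by_identity = Policy(
    "EDIT_TAGS", "dashboard", frozenset({"dashboard:sales"}), users=frozenset({ALL})
)
```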
A
So now I'm going to go into a demo of the milestone one implementation of policies, based on what we just talked about. I'm going to go over here to DataHub and, right off the bat, this is the default deployment of the new policies world. So I'm going to go ahead and search. I've just got some of this basic sample metadata in here that you guys are all familiar with.
A
Probably. I'm going to go to this first dataset and I'm going to try to add a tag, right, so let's say "new tag". Okay, I already have one, "my new tag". What you'll notice right away is that we've got a warning here which says, "Looks like you're unauthorized to perform that action." So why would that be? Well, that's because we haven't defined any policies yet, so by default I am not able to do anything to this dataset, right. This is kind of a fail-closed world.
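The "fail-closed" behavior seen in the demo, where nothing is permitted until a policy grants it, can be illustrated with a very small check. This is a sketch under assumptions: the `(actor, privilege, resource)` tuple shape and the identifier strings are hypothetical and chosen only to show the default-deny idea.

```python
from typing import Iterable, Tuple

# Each grant is a hypothetical (actor, privilege, resource) triple.
Grant = Tuple[str, str, str]


def is_authorized(grants: Iterable[Grant], actor: str, privilege: str, resource: str) -> bool:
    """Allow only if some grant matches exactly; otherwise deny (fail closed)."""
    for grant in grants:
        if grant == (actor, privilege, resource):
            return True
    return False  # nothing matched, including the case of no policies at all


no_policies_result = is_authorized([], "datahub", "EDIT_TAGS", "dataset:sample")
print(no_policies_result)  # False
```

The important design point is the final `return False`: the absence of a matching policy is treated as a denial rather than as permission.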
A
You know, actor, object, privilege. So I'm going to start by giving my policy a name, and I'm going to actually use the example from the slides. I'm going to say "Dataset owners documentation policy", right. So basically, I want to say that owners can update documentation, but that's it, about a particular dataset. So next I'm going to choose the type of the policy.
A
Finally, I'm going to give it a description. Say, "Only owners should be allowed", or sorry, let's actually say, "Owners should be allowed to edit docs." That's it. I'm going to hit next, and I'm going to choose the asset type that I want to apply the policy to, so in this case it's going to be datasets. I'm going to choose that, and then I'm going to choose the asset that I want this policy to apply to. So I can either search for a particular asset, right, or I can just say "all".
A
Finally, we get to the third and final screen, where we can say who can actually do this. You'll see right away there's three kinds of options here. We can call out users specifically, so I can say the DataHub user, or John Doe, or whatever; we can call out groups; or we can say owners, right. So this is that edge-based predicate.
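The three actor options on this screen (specific users, specific groups, or owners of the resource) could be evaluated roughly as follows. This is a sketch, not the implementation shown in the demo: the `ActorFilter` class, its field names, and the way ownership is passed in as a set are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ActorFilter:
    users: FrozenSet[str] = frozenset()
    groups: FrozenSet[str] = frozenset()
    resource_owners: bool = False  # if True, owners of the target resource match


def actor_matches(
    actor_filter: ActorFilter,
    user: str,
    user_groups: FrozenSet[str],
    owners_of_resource: FrozenSet[str],
) -> bool:
    if user in actor_filter.users:
        return True
    if actor_filter.groups & user_groups:
        return True
    # "Edge-based": decided by the ownership relationship between the user
    # and the specific resource, not by a fixed list of names.
    return actor_filter.resource_owners and user in owners_of_resource


owners_only = ActorFilter(resource_owners=True)
```

Because the owners option is resolved per resource, the same policy can allow a user on one dataset and deny them on another, which is what the demo goes on to show.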
A
Finally, I'm just going to save this, and now you see I have a new policy, right. You can see it's in an active state, which means it should apply. So I'm going to go ahead and go back to the datasets. As you'll notice, this one actually isn't owned by me. I'm logged in as datahub, so I'm going to go to the second dataset, which is owned by me, and I'm going to attempt to update the documentation.
A
And you'll see I was able to update it. Great, awesome. So let's actually back out here and let's try to update a dataset's documentation that I don't own, right. So I don't own this one. I'm going to come in here and say, hey, I want to update. Oops, looks like I'm unauthorized to perform that, and that's because the policy doesn't allow me to do that.
A
So I'm going to go back and I'm going to open up this policy again. I'm going to take a look at what it says, and I'm actually just going to deactivate it, because, you know, actually I want to revoke this policy. So I'm going to go ahead and click deactivate, and you'll see that this policy is now in an inactive state.
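Deactivating a policy, as opposed to deleting it, can be pictured as flipping a state field that the evaluation step filters on. The sketch below assumes a two-value state and hypothetical names (`PolicyState`, `effective_policies`); it only illustrates the active versus inactive distinction from the demo.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class PolicyState(Enum):
    ACTIVE = "ACTIVE"
    INACTIVE = "INACTIVE"


@dataclass
class Policy:
    name: str
    state: PolicyState = PolicyState.ACTIVE


def effective_policies(policies: List[Policy]) -> List[Policy]:
    """Only active policies take part in authorization decisions."""
    return [p for p in policies if p.state is PolicyState.ACTIVE]


docs_policy = Policy("Dataset owners documentation policy")
policies = [docs_policy]
before = len(effective_policies(policies))
docs_policy.state = PolicyState.INACTIVE  # "deactivate": revoke without deleting
after = len(effective_policies(policies))
```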
A
I'm going to again choose datasets, and in this case I'm going to actually look up a particular dataset. So I want to say that I should be able to, you know, update the HDFS dataset, or maybe the Kafka one as well, so I'll select two of them. And then finally I'll select a privilege, in this case editing tags, and then I will just find myself, datahub, and I'll save it. And you can see we've got the new policy; it's in the active state.
A
So now I should be able to update the tags for this HDFS one, which I wasn't able to update in the initial case. So let's say "my new tag" again, see if I can add it. Looks like I was able to add it. I can remove tags, of course, because I have full control over editing the tags. All right. So this one works. This one's deactivated, and then, actually, sorry, this one's deactivated, and then I'm also the owner of this one.
A
So I can probably add a tag here as well. Awesome. Okay, so we've correctly created two policies, and now there's the final thing I want to demo, which is just cleaning up policies. There are cases in which you may have created a policy by mistake. What you can do there is you can actually just come in and delete the policy, right. Delete the policy, and we're back to state zero.
A
So this is, in a nutshell, what policy management and fine-grained access control will look like on DataHub. This is the MVP. All of those privileges and the assets you saw, both metadata privileges as well as the platform privileges, will be supported: basic platform privileges, including managing policies, managing analytics, things like that. Eventually that will be extended to include things like managing users and groups, so adding groups, deleting groups, things like that.
B
There's one question about who can even edit policies. Like, who has admin privileges on even the ability to add or create policies?
A
Yeah, so we model the ability to manage policies as a platform privilege, right. And so, by default, DataHub will ship, or launch, with a set of immutable policies, and those immutable policies will grant the ability to manage policies and to manage analytics to that core superuser, which is datahub today. So when you launch a fresh instance of DataHub, that datahub user will have all privileges on the platform, and that'll be the jump-off point from which you can create additional policies.
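One way to picture the seeded, immutable policies described in this answer is a store that creates a non-editable root policy at startup and refuses to change or delete it. This is a sketch only: the `PolicyStore` class, the `editable` flag, the policy name, and the privilege strings are assumptions for illustration; only the `datahub` superuser and the idea of immutable seed policies come from the talk.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class Policy:
    name: str
    actors: FrozenSet[str]
    privileges: FrozenSet[str]
    editable: bool = True


class PolicyStore:
    def __init__(self) -> None:
        self._policies: Dict[str, Policy] = {}
        # Seed an immutable policy for the core superuser on a fresh instance.
        self._policies["root-platform"] = Policy(
            name="root-platform",
            actors=frozenset({"datahub"}),
            privileges=frozenset({"MANAGE_POLICIES", "MANAGE_ANALYTICS"}),
            editable=False,
        )

    def add(self, policy: Policy) -> None:
        existing = self._policies.get(policy.name)
        if existing is not None and not existing.editable:
            raise PermissionError(f"policy {policy.name!r} is immutable")
        self._policies[policy.name] = policy

    def delete(self, name: str) -> None:
        if not self._policies[name].editable:
            raise PermissionError(f"policy {name!r} is immutable")
        del self._policies[name]

    def privileges_for(self, actor: str) -> Set[str]:
        granted: Set[str] = set()
        for policy in self._policies.values():
            if actor in policy.actors:
                granted |= policy.privileges
        return granted
```

Starting from the seeded superuser, additional policies can then be added for other users, which matches the "jump-off point" described above.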
A
Quickly talk about the implementation, like what's going on here. Recently, we've moved our GraphQL API to the metadata service, so that's actually where a lot of this is occurring. So what happens when a request comes in?
A
So one is on a cadence, so you can configure it to be syncing every two minutes, five minutes, 10 minutes, whatever you'd like; by default, it's at two minutes. The other is when the cache becomes stale. So if you add a policy or edit a policy's state, as you saw in this demo, we will actually go and refetch and reboot the cache. And so that gets us into the authorizer itself.
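The two refresh triggers mentioned here, a configurable cadence (two minutes by default) and invalidation when a policy is added or edited, can be sketched with a small cache. Note the simplification: the talk describes a sync on a cadence, whereas this sketch refreshes lazily on read once the interval has elapsed. The class name, method names, and the injectable clock are assumptions for illustration.

```python
import time
from typing import Callable, List, Optional


class PolicyCache:
    def __init__(
        self,
        fetch: Callable[[], List[str]],
        refresh_interval_seconds: float = 120.0,  # two minutes, per the talk's default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._interval = refresh_interval_seconds
        self._clock = clock
        self._policies: List[str] = []
        self._loaded_at: Optional[float] = None

    def invalidate(self) -> None:
        """Mark the cache stale, e.g. after a policy is added or edited."""
        self._loaded_at = None

    def get(self) -> List[str]:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._interval:
            self._policies = list(self._fetch())
            self._loaded_at = now
        return self._policies


# Usage with a fake clock and an in-memory "policy store".
store = ["p1"]
fetch_calls = []


def fetch() -> List[str]:
    fetch_calls.append(1)
    return list(store)


fake_now = [0.0]
cache = PolicyCache(fetch, 120.0, clock=lambda: fake_now[0])
first = cache.get()       # initial load
fake_now[0] = 60.0
cache.get()               # still fresh, served from the cache
store.append("p2")
cache.invalidate()        # a policy changed, so the cache is stale
after_edit = cache.get()  # refetched immediately
```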
A
This is the key component, which basically maintains that cache, always keeps the latest view of the policies, and makes a determination at request time whether to allow or deny a particular action. It does so by exposing an API that takes those three pieces of the policy that we talked about earlier. So at request time, the invoking code will pass an actor, which is basically the user principal behind the request; it'll pass the groups that that user is associated with; as well as a privilege.
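The request-time call described here, where the invoking code passes the actor, the actor's groups, and a privilege, might look like the sketch below. The class and method names are hypothetical, and the optional `resource` argument is an assumption added to cover the third piece of a policy; the talk does not spell out the actual signature.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Policy:
    privilege: str
    users: FrozenSet[str] = frozenset()
    groups: FrozenSet[str] = frozenset()
    resource: Optional[str] = None  # None means any resource (an assumption)


class Authorizer:
    def __init__(self, policies: Iterable[Policy]) -> None:
        self._policies: List[Policy] = list(policies)

    def authorize(
        self,
        actor: str,
        groups: Iterable[str],
        privilege: str,
        resource: Optional[str] = None,
    ) -> bool:
        """Return True to allow and False to deny."""
        group_set = frozenset(groups)
        for policy in self._policies:
            if policy.privilege != privilege:
                continue
            if policy.resource is not None and policy.resource != resource:
                continue
            if actor in policy.users or policy.groups & group_set:
                return True
        return False  # no matching policy: deny


authorizer = Authorizer([
    Policy("EDIT_TAGS", users=frozenset({"datahub"}), resource="dataset:hdfs"),
    Policy("MANAGE_POLICIES", groups=frozenset({"admins"})),
])
```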
A
So it's pretty awesome. So, policies in practice. We want policies to be enabled or disabled globally at deploy time. What this means is you can continue to use DataHub as you're using it today, where there's no policies and anyone on the platform can do anything.
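A global, deploy-time switch of this kind is often read from configuration and checked before any policy evaluation. The sketch below is illustrative only: the environment variable name `EXAMPLE_POLICIES_ENABLED` is made up for this example and is not a documented DataHub setting, and the default of "enabled" is likewise an assumption.

```python
import os
from typing import Mapping, Optional


def policies_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read a hypothetical deploy-time flag; defaults to enabled in this sketch."""
    source = os.environ if env is None else env
    return source.get("EXAMPLE_POLICIES_ENABLED", "true").strip().lower() == "true"


def final_decision(enabled: bool, policy_decision: bool) -> bool:
    """With policies disabled globally, everyone can do anything."""
    return policy_decision if enabled else True
```

With the flag off, `final_decision` ignores the policy result entirely, which corresponds to the "anyone on the platform can do anything" mode described above.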
A
We wouldn't recommend that. We recommend you actually do start using the policies, because I think they'll be very, very helpful to make sure that metadata stays clean. But by default, again, datahub will be that superuser, which will be seeded with irrevocable, immutable policies that say that it can do anything, and so it'll be on the operator to go and spawn off additional policies, on a per-policy basis, from that core admin account.
A
Finally, I'll just talk a little bit about what's on the horizon for policies. After we get this first code pass done, we want to release a Policies V1 usage guide that'll talk about how you create policies and how you manage them. Hopefully it's self-explanatory, but I think it will still be pretty helpful to have something accompanying a feature this big. We'll also look at supporting additional predicate types, especially on the resource itself.
A
So, as you saw, there's mainly users and groups which are able to do different things. We have had some requests from a few folks that this layer of indirection, which is commonly called a role, would perhaps be useful. So we're actually looking for feedback and direction from the community to understand whether that's a requirement that really is something we need to take into account here with this system.
A
So that's pretty much it. Thanks, guys. I will hand it back to Shirshanka.