From YouTube: NEW! DataHub Actions Framework
Description
John Joyce (Acryl Data) shares the new Actions Framework for developing & deploying real-time outbound integrations with DataHub during the April 2022 Town Hall.
Learn more about DataHub: https://datahubproject.io
Join us on Slack: http://slack.datahubproject.io
Follow us on Twitter: https://twitter.com/datahubproject
All right, so just one disclaimer: my apartment complex just mentioned they're going to be running fire drills, or fire alarms, so just a disclaimer. Hopefully that doesn't happen during the demo.
All right, I'm going to talk about the DataHub Actions Framework, which is a project that we've been working on for the past few months at Acryl, and I'm going to get right into it by starting with a brief history of getting data into DataHub.
Now, if you were in the DataHub community around January 2021, you would probably have noticed there were a lot of questions like this in our chat: how do I get data into DataHub? And at that time our response was pretty much: I don't know, write some scripts. We have some example Python scripts you can use, we have some example Java, but you're kind of on your own. Come February 2021, we decided to actually build an answer to that question, which was the Metadata Ingestion Framework.
Since then, we've seen how powerful it is to break things down into simple, standardized abstractions. We've gotten a myriad of different contributions to the ingestion framework since we rolled this out, even for some sources that the team probably hadn't ever heard of before. Fast forward to today, and we're getting a lot of new types of questions. For example: how do I create a Jira ticket when something happens on DataHub?
So what we did is we leaned into these questions, and we found that there was a theme, which was getting data out of DataHub, particularly in real time. And we narrowed it down to a few different categories that we saw people asking about. The first was around workflow integration.
So I'm going to start by just walking through a hello world of using the Actions Framework. It all starts with installing the DataHub Actions CLI.
You can do that by just pip installing acryl-datahub-actions, very similar to how you install actual DataHub. I've already done that, so I'm going to skip this step. The second step is to define what we're calling an action configuration file. Similar to an ingestion recipe, this is a way to tell DataHub what you want to do.
It also takes some configuration which allows you to print the events in uppercase or lowercase, just for fun. Let's start with uppercase. So once we've defined this, we can now move on to the CLI and actually run the DataHub action.
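For reference, a minimal hello-world action configuration might look something like this (the exact field names and defaults here are illustrative, so check the DataHub Actions documentation for the current schema):

```yaml
# hello_world.yaml -- a minimal action configuration (field names illustrative)
name: "hello_world"
source:
  type: "kafka"
  config:
    connection:
      bootstrap: ${KAFKA_BOOTSTRAP_SERVER:-localhost:9092}
      schema_registry_url: ${SCHEMA_REGISTRY_URL:-http://localhost:8081}
action:
  type: "hello_world"
  config:
    # print events in uppercase or lowercase, just for fun
    to_upper: true
```

You would then run it with something along the lines of `datahub actions -c hello_world.yaml`.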
You'll see that there are actually two types of events that we're getting, and I'll talk about these in a little more depth once we get back to the presentation, but we have a Metadata Change Log event and we have an Entity Change Event coming in. So right now this action is printing pretty much everything that's happening on DataHub. What we can do inside of this configuration file is actually filter down, to only invoke that hello world action on events that we care about.
With my new configuration, you'll now see that if I remove this PII tag, nothing comes up, but if I add it back in, of course, we get the event. You can also do "or" conditions. So maybe I care about a tag being added or removed; maybe any time a PII tag is changed, it's really important to me, so I'm going to do an "or" condition, and what this will allow me to do is capture events that represent both removals and additions.
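As a sketch of what that filtering could look like in the configuration file (again, the field names and the list-means-or convention shown here are illustrative rather than the definitive syntax):

```yaml
# Filter block: only invoke the action for PII tag changes.
filter:
  event_type: "EntityChangeEvent_v1"
  event:
    category: "TAG"
    # A list acts as an "or" condition: match either operation.
    operation: ["ADD", "REMOVE"]
    modifier: "urn:li:tag:pii"
```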
Awesome. Okay, so this is super interesting, but maybe not so useful, though, because all we're doing is printing hello world.
So what I'm going to show you quickly is just what it looks like to build a custom action to do something that you want it to do, maybe send an email, or audit log, or whatever else. And this all starts with a simple interface that's called Action. To implement a custom action is really a matter of just extending this base interface and implementing a simple act method. This act method is invoked by the framework whenever an event comes in. In this case we're going to just print, but in reality you would probably do something really important.
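To make that concrete, here's a self-contained sketch of the pattern. The real base class lives in the datahub-actions package (and its `create`/`act` signatures may differ slightly); the simplified `Action` and `EventEnvelope` stand-ins below are mine, just to show the shape of a custom action:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict

# Simplified stand-in for the framework's event wrapper type.
@dataclass
class EventEnvelope:
    event_type: str
    event: Dict[str, Any]

class Action(ABC):
    """Sketch of the base interface: the framework calls act() per event."""

    @classmethod
    @abstractmethod
    def create(cls, config: Dict[str, Any]) -> "Action":
        """Factory the framework uses to instantiate the action from its config."""

    @abstractmethod
    def act(self, event: EventEnvelope) -> None:
        """Invoked by the framework whenever an event comes in."""

class CustomHelloWorldAction(Action):
    def __init__(self, to_upper: bool) -> None:
        self.to_upper = to_upper
        self.seen = []  # record processed events so we can inspect them

    @classmethod
    def create(cls, config: Dict[str, Any]) -> "CustomHelloWorldAction":
        return cls(to_upper=config.get("to_upper", False))

    def act(self, event: EventEnvelope) -> None:
        text = f"Hello world! Received event: {event.event_type}"
        print(text.upper() if self.to_upper else text)
        self.seen.append(event)

# The framework would do the equivalent of this for each incoming event:
action = CustomHelloWorldAction.create({"to_upper": True})
action.act(EventEnvelope("EntityChangeEvent_v1", {"category": "TAG", "operation": "ADD"}))
```

In a real action you would replace the print with the actual side effect, such as sending an email or writing to an audit log.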
Once we do that, we can run the custom action, and you'll see that it will print out a bunch of events. This is because this action is stateful: it actually tracks where it left off in the audit log, and so it has some catching up to do once it starts up. That's the first thing it'll do. All right, so that pretty much rounds out the demo; I'm going to go back to the slides.
All right, a quick recap of the quick start for folks who maybe just want to reference it, or who couldn't make the talk today: install DataHub Actions, configure an action, and run it. You'll see "Action pipeline with name X is now running" if it's successful. A custom action is a simple matter of implementing the Action interface and then running it. Now, the events that you saw coming in were of two types, and these will both be included in the first release of Actions. The first is the Entity Change Event.
This is a high-level event which is emitted when important changes are made to a DataHub entity: for example, tags are added or removed, glossary terms are added or removed, even dataset schema fields are added or removed, and finally domains. There are also many more, like descriptions, and the list kind of goes on, but these are probably the most critical. And then number two is the Metadata Change Log event.
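To give a feel for the shape of an Entity Change Event, a tag-added event might look roughly like the following. The field names and values here are illustrative approximations, not the authoritative schema:

```json
{
  "entityUrn": "urn:li:dataset:(urn:li:dataPlatform:hive,SampleTable,PROD)",
  "entityType": "dataset",
  "category": "TAG",
  "operation": "ADD",
  "modifier": "urn:li:tag:pii",
  "auditStamp": { "actor": "urn:li:corpuser:jdoe", "time": 1649953100653 }
}
```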
Now, as usual, I like to take a look under the hood to talk about how this actually works, and really it starts with a few fundamental concepts: an event, which is a data object which should be processed by the framework; an event source, which is a source of events (in our case, the event source is Kafka); a transformer, which is a transformer of events or a filter of events (that filter block is actually a transformer); an action, which takes action on events; and a pipeline, which manages the coordination among these different components.
Essentially, it manages the life cycle of an event. And then there's a pipeline manager, which allows you to manage multiple pipelines running in parallel in the same process. So at a glance, this is kind of how things look: you have a source, which produces events; you have a set of transformers, including filtering; and then you have an action, which takes action on the event.
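The source-to-transformers-to-action flow can be sketched in a few lines of plain Python. This is purely illustrative (the real framework's classes and signatures differ), but it shows how a pipeline manages an event's life cycle, with a filter being just a transformer that returns nothing:

```python
from typing import Callable, Iterable, List, Optional

Event = dict
# A transformer may rewrite an event, or return None to filter it out.
Transformer = Callable[[Event], Optional[Event]]
ActionFn = Callable[[Event], None]

class Pipeline:
    """Wires a source to transformers and an action, and drives each event
    through produce -> transform/filter -> act."""

    def __init__(self, source: Iterable[Event],
                 transformers: List[Transformer], action: ActionFn) -> None:
        self.source = source
        self.transformers = transformers
        self.action = action

    def run(self) -> int:
        acted = 0
        for event in self.source:
            for transform in self.transformers:
                event = transform(event)
                if event is None:  # filtered out: skip the action
                    break
            else:
                self.action(event)
                acted += 1
        return acted

# A filter is just a transformer that may return None.
def only_tag_events(event: Event) -> Optional[Event]:
    return event if event.get("category") == "TAG" else None

received: List[Event] = []
pipeline = Pipeline(
    source=[{"category": "TAG", "operation": "ADD"}, {"category": "DOMAIN"}],
    transformers=[only_tag_events],
    action=received.append,
)
count = pipeline.run()  # only the TAG event reaches the action
```

A pipeline manager would then simply hold several of these `Pipeline` objects and run them in parallel within one process.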
Some notable capabilities I wanted to call out: we will support distributed actions from the very beginning. This means you'll be able to load balance among actions instances, as long as the configuration is the same. We achieve this because we use Kafka consumer groups under the hood, which allows you to load balance among a single stream.
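The consumer-group effect can be illustrated with a tiny sketch. Kafka's actual assignment is negotiated through the broker-coordinated group protocol, so this is only a mimic of what a round-robin assignor ends up doing: instances sharing a group id split the topic's partitions, so each event is handled by exactly one instance:

```python
from typing import Dict, List

def assign_partitions(partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Round-robin partitions across the consumers in one group (illustrative)."""
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for p in range(partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

# Two action instances with the same configuration (same group) split the load.
assignment = assign_partitions(6, ["actions-instance-1", "actions-instance-2"])
```

Adding a third instance with the same configuration would trigger a rebalance, and the partitions (and hence the events) would be redistributed across all three.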
What this means is making sure that the inputs and the outputs of each transformer and each action are what's expected. Failed event replay: currently we log the events to a failed-events file, but we don't yet have a mechanism to load that file and replay it through the entire action pipeline. Asynchronous event commits: currently we support synchronously acking after an action has actually processed an event, which is conducive to Kafka. And more filter types: being able to filter by a regex pattern, or something more dynamic than a simple exact match.
This is a call to action: we definitely need your help as the community to make this framework a success, so we're going to be accepting contributions on pretty much everything, from the core framework to the actions and transformers. And with that, I think that's the presentation. So thank you very much, and I'll hand it back to Maggie.