From YouTube: Software Portal: Sources: 2021-07-30
A: All right, this is the first step in implementing the crawler for the SAP InnerSource portal. SAP has developed a basic HTML front end for an InnerSource portal, and it essentially looks like this. What we're going to do here is implement what they call a crawler. This front end is static, and it pulls its data from this repos.json file.
A: Now we're going to implement the thing that puts the data in the repos.json file. They've got a little diagram here of how that works, and ours is going to be a slightly modified version of these two, using data flows and operations, and also sources.
A: We need to put information in this repos.json file. Our input data is essentially this tree structure: it's a directory, and within that directory we have subdirectories, which are the org names of GitHub orgs. Within those directories we have repos.yaml files, and in those repos.yaml files we have the name of each repo that we care about tracking, along with the owners of that repo.
A: We could have arbitrary other information in there as well. So let's go there; this is what it looks like. For example, we're tracking intel, and here's the repos.yaml under intel; then we're tracking tpm2-software, and here's some repos.yaml there. Let's grab one of these so you can see what they look like right now. For example, here's dffml, and I'm the owner, and then there's cve-bin-tool, and Terri is the owner.
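The repos.yaml files look roughly like this (the owner email addresses shown here are placeholders, not the real ones):

```yaml
name: dffml
owners:
- owner@example.com
---
name: cve-bin-tool
owners:
- owner@example.com
```

Multiple repos in one file are separated as YAML documents with `---`, which comes up again when the files are loaded.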
A: Obviously, I've just picked one person here to simplify things, but you get the picture. So that's our input data, and we're going to be pulling repos from there; we want to take each one of these repos and generate something from it.
A: We want to generate an entry in the SAP repos.json for each repo, which looks like this: basically, it's an id and some info from the GitHub repo. What they've got here by default, if you read their crawler document, is just what you get if you hit the GitHub API's repo search, and so this is the basic information that they're asking for and will display in the portal. And then they have this metadata section.
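For reference, an entry in the portal's repos.json is shaped roughly like this (the field values are illustrative, and the authoritative schema is the one in the portal's crawler documentation):

```json
[
  {
    "id": 123456789,
    "name": "dffml",
    "full_name": "intel/dffml",
    "html_url": "https://github.com/intel/dffml",
    "description": "An example description",
    "_InnerSourceMetadata": {
      "logo": "https://example.com/logo.png",
      "participation": [0, 3, 1]
    }
  }
]
```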
A: They have this InnerSource metadata section here, which is where they add extra information, and so we're going to add some extra information there. We're also going to have the standard information here populated. So we've got our input, which is these repos.yaml files, and we've got our output, which is this repos.json file. Each repo maps to the DFFML concept of a Record, which is anything that's uniquely identifiable.
A: With GitHub repos you've got the owner and the repo name, and that's essentially your unique key: you can uniquely identify any repo using those two things, or, for example, the URL here, right? So github.com and then sol/earth, because they're doing a demo with planets here.
A: So the first step is really just: read in the repos that we care about, and then write them out. What we do first is implement the source to read the repos in our input format, which is the directory of YAML files, and we also implement the source to write to the repos.json file. You'll see we actually implement the repos.json reader/writer first, and then the one for the directory structure with the YAML files, and then, in the following video, we'll go through and run the data flow.
A: We're going to create a data flow using operations, and those operations will collect the pieces of data that we want to have in the output repos.json, which we want to display on our InnerSource portal. We'll write little operations that each collect maybe one piece of data; for example, we might have an operation that grabs the description.
A: We might have an operation that calculates the participation, or one that grabs the logo. For example, one of the operations we're going to write will generate the Gravatar URL based on the owner, so we can display it. If you're not familiar with Gravatar, go check it out: basically, you add your picture, and then anywhere on the internet your email address can be used to look up your profile picture. Okay, so we'll implement those operations, and we'll show how to tie them together using a data flow.
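The Gravatar derivation that operation will do can be sketched like this (the function name is mine; Gravatar's lookup is the MD5 hex digest of the trimmed, lowercased email address):

```python
import hashlib


def gravatar_url(email: str) -> str:
    """Return the Gravatar avatar URL for an email address.

    Gravatar hashes the trimmed, lowercased email with MD5 and
    appends the hex digest to the avatar base URL.
    """
    digest = hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()
    return f"https://www.gravatar.com/avatar/{digest}"
```

Because the hash is computed locally, no web request is needed to produce the URL; the browser fetches the image when the portal renders it.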
A: We can leverage data that's generated by different operations, so that not every operation has to do everything. For example, say we have an operation that makes a GitHub API request that gets this data; like I said, this is the data you get just from doing a regular GitHub search, right? That operation might return a repository object that has, at minimum, this data.
A: Then, instead of every other operation that generates some metrics or some data for this InnerSource metadata, the additional stuff, having to make a request to the GitHub API itself, we can leverage the object that we returned from the first operation. Now we can write more operations that calculate things or, for example, generate things based off the owner's email address, like the Gravatar URL, and we don't have to make more web requests.
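The idea can be sketched with a toy resolver (this is not DFFML's actual API; DFFML declares operations with input/output definitions and its orchestrator wires them, but the principle is the same): each operation names the data it needs, and later operations consume what earlier ones produced instead of re-fetching it.

```python
# A toy data flow: each operation is (input names, output name, function).
# A real orchestrator resolves this graph; this just shows why an
# operation never needs to re-fetch data another operation produced.
def run_flow(operations, seed):
    data = dict(seed)
    progress = True
    while progress:
        progress = False
        for inputs, output, fn in operations:
            if output not in data and all(i in data for i in inputs):
                data[output] = fn(*(data[i] for i in inputs))
                progress = True
    return data


operations = [
    # Derives a normalized email from data produced elsewhere.
    (("owner_email",), "gravatar_id", lambda e: e.strip().lower()),
    # Builds the repo URL from the org and repo name inputs.
    (("org", "name"), "url", lambda o, n: f"https://github.com/{o}/{n}"),
]

result = run_flow(
    operations,
    {"org": "intel", "name": "dffml", "owner_email": " Owner@Example.com"},
)
```

After the run, `result` holds `url` and `gravatar_id` alongside the seed inputs, and any further operation can consume them.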
A: We can just pass around the same objects, and this means that the authors of these operations don't necessarily have to know how to use the GitHub API. They may just know that, hey, there's this object floating around that somebody else has already used the API to get. That way we can create a large directed graph of how the data flows: you write operations, they produce data, and you can leverage the data produced within other operations.
A: All of that without having to go re-grab the data. All right, so let's get to it. We're going to write these two sources, starting with the repos.json source, which is the output source. We're just going to implement reading and writing, because that's very easy for a JSON source, and then I'll show you how we implement the one to read the repos.yaml files organized under a specific directory tree. All right, so let's get to it!
A: I'm going to play you an asciicast of how I did this. I know this is 4K, so you might have to zoom in. Okay, this is going to be too fast, isn't it? All right.
A: All right, so what I did here is I started from the JSON source itself. The JSON source is an existing source that stores things in a JSON object. Now, the SAP format is actually an array.
A: So we're not actually going to subclass from the JSON source; we're going to subclass from the same things that the JSON source subclasses from, which are the file source and the memory source. The memory source essentially provides us with a dictionary, self.mem, that backs all the objects we've loaded into memory, and the file source provides us with the open and close handling.
A: We make the unique key the html_url, which is the same as the repo URL, and we create a Record object for each one of those, where the feature data is that example data JSON object up there. Then, when we dump it out, we just dump the dictionary that we have in memory to a list, and when we load it back in, we load it into the dictionary keyed off the URL. We're also going to register this thing as an entry point.
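The dump and load steps amount to converting between the in-memory dictionary keyed by URL and the JSON array the portal expects (a sketch, not DFFML's actual source code; the function and field names are illustrative):

```python
import json


def dump_records(mem: dict) -> str:
    # self.mem maps html_url -> record data; the portal wants an array,
    # so we drop the keys and serialize just the values.
    return json.dumps(list(mem.values()))


def load_records(text: str) -> dict:
    # On the way back in, re-key the array off each entry's html_url.
    return {entry["html_url"]: entry for entry in json.loads(text)}
```

Because each entry carries its own html_url, the round trip loses nothing: dumping to a list and re-keying on load reproduces the original dictionary.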
A: We're not going to cover entry points quite yet; registering one is essentially going to allow us a shorthand version of this call right here. Right now we're going to write the long version, where we specify the full path to the class we want to use to list the sources, so you can see the sources module, the SAP portal repos.json module, and then the class name there.
A
And
then
we
provide
the
arguments
which
are
the
same
as
the
file
sources.
Config
arguments
which
is
essentially
you
know,
that's
the
file
name
to
load
from
and
and
that's
just
going
to
be,
our
json
file
that
we
give
so
and
here
you
can
see.
I
got
confused
because
I
had
these
relative
imports
that
I
didn't
delete.
A: So now we've deleted them, and we can go run the source here, to run that command-line example, and we see that we dump the data. Let's see... yeah, I was going to create an issue, and then I didn't. All right, so here's the dump: you run this list command.
A: Running the list command instantiates this class, using this filename as the filename to read the JSON from, and lists all the records in it, which we populated via that self.mem object. So now we've copied the previous file to a new file, to use as a template, and we're going to go through and implement this for the orgs.
A: This is, like I said, the directory structure I showed earlier: it's got orgs, each subdirectory is an org name, and here are the contents of those files. Each has the name and the owners, which is a list. So we go through, and we're going to do a recursive listing of all the YAML files.
A
Our
config
object
is
just
gonna
take
the
directory,
which
is
the
top
level
orgs
directory,
and
so
then
every
yaml
file
we
find
the
org
name
is
going
to
be.
The
the
org
name
is
going
to
be.
The
the
org
name
is
going
to
be
the
subdirectory
name
and
then
the
you
know
we
load
all
the
yaml
documents
and
you
can
see.
A
Lingamble
documents
are
separated
by
the
three
three
dashes
there,
so
we
load
all
the
yaml
documents
in
each
repos.yaml
file
and
we'll
you
know
key
the
urls,
which
I
think
I
forgot
to
do
so
I'll.
Do
that
right
here
at
the
end,
we'll
key
we'll
we'll
key
the
records
based
on
you
know:
org
name,
slash,
repo
name
right,
and
so
here
what
we
do
is
we
can
just
implement
the
the
memory
source
we're
not
going
to
use
the
file
source
because
obviously
we're
reading
from
a
directory,
not
a
file.
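That loading step can be sketched as follows (this is my simplified version: it walks the orgs directory, splits each repos.yaml on the `---` document separators, and keys each entry as org/repo; a real implementation would parse the documents with PyYAML's safe_load_all rather than this naive field splitting):

```python
from pathlib import Path


def load_orgs(orgs_dir: str) -> dict:
    """Walk orgs_dir and key every repo entry as '<org>/<repo>'."""
    records = {}
    for repos_yaml in Path(orgs_dir).rglob("repos.yaml"):
        org = repos_yaml.parent.name  # the subdirectory name is the org
        # Naive multi-document split; real code should use yaml.safe_load_all()
        for doc in repos_yaml.read_text().split("---"):
            fields = dict(
                line.split(":", 1)
                for line in doc.splitlines()
                if ":" in line and not line.startswith("-")
            )
            name = fields.get("name", "").strip()
            if name:
                records[f"{org}/{name}"] = {"org": org, "name": name}
    return records
```

The org/repo key is exactly the unique identifier a Record needs, which is what makes this directory tree usable as a source.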
A: That load_fd method actually gets called in the __aenter__ method, which is the context entry to this source object, and you can read about that on the double context entry page. Everything follows this double context entry pattern, and for the memory source here, the inner context is handled for you, so you only have to do the top-level context.
A: I'm going to open this up in a second here, because now it's going too fast. All right, so let me bring this up... sources... okay, so, on entry: when this class is instantiated, per the double context entry documentation, we do this double context entry pattern, and the first context entry is what you're seeing here in this __aenter__ method.
A
And
so
now
we
can
use
this
as
an
opportunity
to
populate
the
self.mem
that
we
were
populating
before
and
so
in
in
the
previous
example
and
and
all
the
other
sources,
if
you
go
go
through
them,
if
they're
based
on
memory
source,
so
you
can
yeah
you
can
you
can
you
can
populate
that
self.mem
here
and
you
can
see
what
we're
doing
is
just
you
know,
recursively
listing
all
the
yaml
files
grabbing
them
and
then
setting
the
name,
and
so
we
have
to.
A: So this is the Record's key, which is the unique identifier that I was talking about. As long as we have a unique identifier, we can use the source construct. So now we've got the key, which is the github.com org/repo name, and we'll run this example here, and it will show up like this. So... where's my... there we go; rerun this example command. Oops, oh, and I'm not in that...
A
All
right
so
now
we
see
that
the
key
is
is
the
correct
set
of
values
here,
which
is
that
you
know
the
full
github
url
all
right
so
and
now
what
this
is
gonna
do
is
right.
So
what
we
can
do
now
is
is,
is
we'll
go
and
we'll
make
our
data
flows
and
we'll
make
it
so
that
you
know
just
like
the
list
command.
All
you
know
all
the
command
syntaxes
is
very
similar
here.
A: So we're going to do a dataflow run command, and we're going to use this orgs source as our input source; then we're going to run the data flow on each one of these, and produce a record that's going to be output to the other source that we wrote, which is that repos.json source. That's what we're going to cover in the next video.