Description
Identifying Pokémon Cards by Hugo Peixoto
I want my Pokémon TCG inventory to be digitized so I can search my cards and know which ones I still need. After building a tool to manually enter the cards, I decided to explore Computer Vision algorithms to automate part of the process. In this talk we'll go over some common algorithms used in this area and the roadblocks that I hit while learning.
So I decided to fix this. I started by building a website where I could go and manually enter how many cards I have of each, and that would get saved to a database. Now, this worked fine, but I wanted to try something different. I wanted to have a webcam pointed at my desk, put a card in there, and have the software automatically detect which card it was and store that in the database.
Initially, I was working with frames like this one on the left, and you can see that there are some shadows, and the wood pattern of the desk causes some noise as well. So these things were making my life a bit harder than I wanted. So I moved to a more controlled environment, like the picture on the right. There you have a white background and there are no shadows, and things worked much better under these conditions.
Now, I mentioned that I needed a data set. There's this website, pokemon cards.com, and they have pretty much all the cards in there that were ever printed in English. Since they don't have an API, what I did was build a Ruby script that just scraped the whole thing and downloaded every card image to my computer, so I ended up with about 14,000 cards.
Now, let's focus on detecting and extracting the card image from the frame. We're working with 1080p color images. You might think of a color image as having three channels: the red, the green and the blue one. Now, what I found is that most computer vision algorithms only really care about the brightness of the pixels, or how light or dark each pixel is. This is equivalent to working with grayscale images, and that's what we're going to do now.
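As a rough illustration of the grayscale step (this is not the code from the talk, which does not say how the conversion was done), here is a minimal sketch in plain Python. The weights 0.299, 0.587 and 0.114 are the common Rec. 601 luma coefficients and are an assumption, as is the name `to_grayscale`.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of brightness values (0-255)."""
    gray = []
    for row in rgb_image:
        # Weighted sum of the three channels (Rec. 601 weights, assumed here).
        gray.append([round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row])
    return gray


# A tiny 2x2 "frame": white, black, pure red, pure blue.
frame = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
gray = to_grayscale(frame)
```

With these weights, pure red ends up noticeably brighter than pure blue, which roughly matches how bright each color looks to the eye.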
The Sobel operator highlights the edges of the image, both the outline of the card and the edges in the drawing itself, and it removes any sections of solid color. Since we're looking for the boundaries of the card, this helps highlight the edges, particularly in cases where the card doesn't have a thick black border like this one around it. Some cards have a yellow border, and this helps to normalize the levels a bit.
Now, how does the Sobel operator work? It's a kernel-based image filter, and what that means is that it's an image filter that follows a specific structure: you calculate each pixel independently of the others, and you look at the respective pixel on the source image plus a small window around it. In this case I'm showing a three-by-three window, but it can be a window of any size.
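The kernel idea can be sketched in a few lines of plain Python. The two 3x3 kernels below are the standard Sobel kernels; combining the two responses as `abs(gx) + abs(gy)` and leaving the one-pixel border at zero are simplifying assumptions, and the function names are made up for this example.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_kernel(image, kernel, x, y):
    """Weighted sum of the 3x3 window centred on (x, y)."""
    total = 0
    for ky in range(3):
        for kx in range(3):
            total += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
    return total


def sobel(image):
    """Edge strength for every pixel; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = apply_kernel(image, SOBEL_X, x, y)  # horizontal change
            gy = apply_kernel(image, SOBEL_Y, x, y)  # vertical change
            out[y][x] = abs(gx) + abs(gy)
    return out


# Dark left half, bright right half: a vertical edge down the middle.
step = [[0, 0, 255, 255] for _ in range(4)]
flat = [[7] * 4 for _ in range(4)]
step_edges = sobel(step)
flat_edges = sobel(flat)
```

Each output pixel depends only on the source image, never on other output pixels, so the pixels can be computed in any order. The solid-color image produces zeros everywhere, while the two columns next to the step get a strong response.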
So if we apply the Sobel operator to every pixel on this image, we end up with something like this. The values here range from zero to around a thousand, and to better understand what's going on, let's map this to a grayscale image where the zeros become white and the thousands become black. Here you can see that the corners and the center of the image are completely white, and this indicates that there's no edge in those regions, while around the circle a dark ring has formed, and that indicates that these pixels likely contain an edge.
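The mapping from edge strength to an inverted grayscale picture could look something like the sketch below. Scaling by the largest value found in the image is an assumption (the talk only says that zeros become white and the largest values become black), and `edges_to_inverted_gray` is a hypothetical name.

```python
def edges_to_inverted_gray(edges):
    """Map edge strengths to 0-255 so that 0 -> white (255) and the peak -> black (0)."""
    peak = max(max(row) for row in edges)
    if peak == 0:
        # No edges at all: the whole picture is white.
        return [[255] * len(row) for row in edges]
    return [[255 - round(255 * value / peak) for value in row] for row in edges]


picture = edges_to_inverted_gray([[0, 204, 1020]])
blank = edges_to_inverted_gray([[0, 0], [0, 0]])
```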
So if we do this for every row, and then we do the same thing from the other side, and from top to bottom and bottom to top, we end up with these marked pixels. So this is the contour of the image, and this works because the card doesn't have any holes or any concave structures or anything like that. And once we have this...
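The part of the talk that explains what is done on each row is missing from this transcript. One plausible reading, sketched below under that assumption, is to walk inwards from each side and mark the first pixel whose edge strength passes a threshold. The threshold value and the name `find_contour` are made up for this example.

```python
def find_contour(edges, threshold):
    """Mark the first strong pixel seen from the left, right, top and bottom."""
    height, width = len(edges), len(edges[0])
    marked = set()
    for y in range(height):
        hits = [x for x in range(width) if edges[y][x] >= threshold]
        if hits:
            marked.add((hits[0], y))   # first hit scanning left to right
            marked.add((hits[-1], y))  # first hit scanning right to left
    for x in range(width):
        hits = [y for y in range(height) if edges[y][x] >= threshold]
        if hits:
            marked.add((x, hits[0]))   # first hit scanning top to bottom
            marked.add((x, hits[-1]))  # first hit scanning bottom to top
    return marked


# A 3x3 block of strong responses: the artwork in the middle is also "edgy".
edges = [
    [0, 0, 0, 0, 0],
    [0, 900, 900, 900, 0],
    [0, 900, 900, 900, 0],
    [0, 900, 900, 900, 0],
    [0, 0, 0, 0, 0],
]
contour = find_contour(edges, threshold=500)
```

Only the outer ring is marked. The strong pixel in the middle is never reached from any side, which is why this kind of scan only works for shapes without holes or concave parts.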
If we do that, we get our intended result. So, with these lines, we can calculate their intersection points. We can just do this by going through every pair of lines, calculating their intersections, and discarding any points that fall outside of our image. These four points represent the four corners of our card.
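Here is a small sketch of the pairwise intersection step. It assumes each line is stored as a tuple `(a, b, c)` meaning `a*x + b*y = c`; the talk does not say how the lines are represented, so that format and the function names are assumptions.

```python
from itertools import combinations


def intersect(line1, line2):
    """Intersection of a1*x + b1*y = c1 and a2*x + b2*y = c2, or None if parallel."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-9:
        return None  # parallel lines never meet
    # Cramer's rule for the 2x2 system.
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return (x, y)


def find_corners(lines, width, height):
    """Intersect every pair of lines and keep only points inside the image."""
    points = []
    for first, second in combinations(lines, 2):
        point = intersect(first, second)
        if point is not None and 0 <= point[0] < width and 0 <= point[1] < height:
            points.append(point)
    return points


# Four card sides in a 100x100 image: x = 10, x = 90, y = 20, y = 80.
sides = [(1, 0, 10), (1, 0, 90), (0, 1, 20), (0, 1, 80)]
card_corners = find_corners(sides, width=100, height=100)
```

Opposite sides of the card are close to parallel, so in a real frame they either never meet or meet far outside the image. Both cases are filtered out, which leaves the four corners.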
Now that we have the corners of the card, we can work on fixing the perspective to get something like this. This transformation is done, roughly speaking, by taking each pixel and moving it to another coordinate, and this movement is done by multiplying each pixel's coordinate by a matrix obtained by solving a system of equations that's based on those four corners. To do this, I used a crate called nalgebra.
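To make the "system of equations" a bit more concrete, here is a sketch in plain Python rather than nalgebra. It uses the usual formulation where the bottom-right entry of the 3x3 matrix is fixed to 1, which leaves eight unknowns and two equations per corner. The function names, the example corner coordinates and the 63x88 target size are made up, and the corners are assumed not to be degenerate (no three on one line).

```python
def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def perspective_matrix(src, dst):
    """3x3 matrix that maps each of the four src points onto its dst point."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(matrix, x, y):
    """Multiply (x, y, 1) by the matrix and divide by the third component."""
    u, v, w = (row[0] * x + row[1] * y + row[2] for row in matrix)
    return (u / w, v / w)


# Skewed card corners in the frame -> upright 63x88 rectangle.
corners = [(10, 20), (90, 25), (95, 85), (5, 80)]
target = [(0, 0), (63, 0), (63, 88), (0, 88)]
matrix = perspective_matrix(corners, target)
```

Calling `warp_point(matrix, 10, 20)` should land very close to `(0, 0)`, and likewise for the other three corners. Every other pixel of the card is moved with the same matrix.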
We can't really compare them directly: since there are 14,000 cards, this would take forever. So we need to somehow reduce the amount of information that we're comparing, and we're going to do this using a perceptual hash. What the perceptual hash is, is a smaller representation of the image that still keeps the essence of the image.
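The transcript does not say which perceptual hash was used. As one simple example of the idea, the sketch below implements an "average hash": shrink the grayscale image to 8x8, then store one bit per pixel saying whether it is brighter than the mean. Two hashes are compared by counting differing bits. The function names are made up, and the shrink step assumes the image dimensions are multiples of 8.

```python
def shrink(gray, size):
    """Downscale to size x size by averaging blocks (dimensions must divide evenly)."""
    block_h, block_w = len(gray) // size, len(gray[0]) // size
    small = []
    for by in range(size):
        row = []
        for bx in range(size):
            block = [
                gray[by * block_h + y][bx * block_w + x]
                for y in range(block_h)
                for x in range(block_w)
            ]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(gray, size=8):
    """One bit per shrunken pixel: 1 if brighter than the mean, else 0."""
    pixels = [p for row in shrink(gray, size) for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(hash1, hash2):
    """Number of bits that differ between two hashes."""
    return bin(hash1 ^ hash2).count("1")


dark_left = [[0] * 8 + [200] * 8 for _ in range(16)]
brighter_copy = [[10] * 8 + [210] * 8 for _ in range(16)]
mirrored = [[200] * 8 + [0] * 8 for _ in range(16)]

h = average_hash(dark_left)
```

A slightly brighter copy of the same picture gets the same hash, while the mirrored picture differs in all 64 bits. Finding a card then means comparing one 64-bit number against each hash in the data set instead of comparing whole images.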
With that, we can fix the perspective of the grayscale image and apply a perceptual hash. Now, this process gave me pretty good results, but there was one case that I needed to deal with: these cards over here. They look the same. They have the same name. The gameplay effect is the same, but there's one small difference.
They were printed in different sets throughout the years, and you can see that the set symbol on the corner there is different. So I need to be able to tell these apart. The set symbols are so small that the perceptual hash doesn't pick up any differences in there. So I needed to find a different way of doing this, so I implemented a technique called template matching.
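A very small sketch of template matching: slide the template over the image and, at each position, add up the squared differences between the two. The lowest score is the best match. Using the sum of squared differences as the score is an assumption, since the talk does not name the metric, and the tiny "symbols" below are invented data.

```python
def match_template(image, template):
    """Return (best_score, (x, y)) for the position where the template fits best."""
    image_h, image_w = len(image), len(image[0])
    templ_h, templ_w = len(template), len(template[0])
    best_score, best_pos = None, None
    for y in range(image_h - templ_h + 1):
        for x in range(image_w - templ_w + 1):
            score = 0
            for ty in range(templ_h):
                for tx in range(templ_w):
                    diff = image[y + ty][x + tx] - template[ty][tx]
                    score += diff * diff  # sum of squared differences
            if best_score is None or score < best_score:
                best_score, best_pos = score, (x, y)
    return best_score, best_pos


# A card region with a 2x2 "set symbol" in the bottom-right corner.
card = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 9, 1],
    [0, 0, 0, 1, 9],
]
symbols = {
    "set_a": [[9, 1], [1, 9]],
    "set_b": [[1, 9], [9, 1]],
}
scores = {name: match_template(card, symbol) for name, symbol in symbols.items()}
best_set = min(scores, key=lambda name: scores[name][0])
```

Running every known set symbol against the corner of the card and keeping the one with the lowest score tells the two printings apart, even though their perceptual hashes are the same.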
So let me show you guys where the set symbol makes a difference. This Marnie here belongs to the Champion's Path expansion set, and you'll see that the correct symbol is shown on the left. There's another Marnie, which is printed in a different set. This one was printed in the Sword & Shield base set, and you can see that it also gets detected correctly.
Now, there are some cards where this doesn't work so well, like this one. It's a foil card, and that means that there's a lot of reflective material in it, so the lights form all these bright patterns and the perceptual hash just isn't able to deal with it. Now, to finalize, here are some of the libraries that I used and some that I think are worth checking out. The last one, Rust CV, is an organization with many computer vision algorithms.
So if you're interested in the area, I think it's worth checking out. They also have some basic tutorials going through some of them. The code for this is available on my GitHub account, so if you want, go and check it out. Feel free to ask me any questions, and that's all I have for you today, so thank you for listening.