OctoML Featured Videos Open Meetings

24 Aug 2023

Tony Tzeng, Chief Product Officer, OctoML

Abstract:
Generative AI is suddenly the hottest workload on the planet. There isn’t a company on earth that isn’t working to build LLMs, audio, image, and video generation features into existing apps (or build new ones from scratch). If the hype around generative AI is an oncoming hurricane, product and engineering leaders are standing in the eye of the storm.

In this talk, OctoML's Chief Product Officer Tony Tzeng will share strategies for how to balance the competing priorities of speed, cost, and quality in AI deployments. This presentation is for leaders of AI, engineering, and product teams, from startups to enterprises.

You will learn:

- Methods for selecting the right model to work with (build vs. buy, open-source vs. closed)
- Why fine-tuning and customization are the secret ingredients for killer generative AI apps
- The practical and cost benefits of hardware flexibility for AI workloads
- How to leverage model optimizations to improve user experience and reduce cost

1 participant
18 minutes

ai

octo

ml

foreo

creators

lisa

complexity

machine

elisa

hi

15 Aug 2023

Learn the three main techniques to customize your image generation models.

5 participants
18 minutes

coffee

espresso

breakfast

important

checkpoint

users

conversation

stable

businesses

seattle

15 Aug 2023

Instantly transform any image with one prompt. Powered by OctoAI.

Filmed on 8/3 at Seattle Tech Week at OctoML HQ.

8 participants
47 minutes

octoi

octo

ai

showcasing

interactive

demo

designer

avatars

app

augmenting

22 Jun 2023

Introducing OctoAI, a self-optimizing compute service to run, tune, and scale models so you can focus on building AI-powered applications that wow your users.

OctoAI is currently in public beta and available to try at octoai.cloud.

11 participants
56 minutes

ai

autopilot

octo

application

startups

backend

v8

ml

innovation

enterprise

7 Apr 2023

Learn to unlock deep insights into model performance with OctoML's Profiler

In this session, you’ll learn to use the OctoML Profiler to gain insight into the end-to-end performance of a prediction function on popular CPUs and GPUs using different efficient compilation techniques. This includes the PyTorch 2.0 native compiler, TorchInductor.

Use these insights to optimize your deep learning application and right-size your hardware selections to avoid costly over-provisioning. The best part? It can all be completed from your development environment.

An and Ben will conduct a live tutorial using Google’s Flan-T5 transformer model. They’ll demonstrate how to add a few lines of code to any PyTorch inference to see how an optimized version of that code will perform on popular GPU and CPU hardware targets. By the end, you’ll have the skills to profile your own PyTorch models so you can confidently select the right model, software configuration, and hardware for your AI application.
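The right-sizing idea described above can be illustrated with a little arithmetic. The following is a minimal sketch, not the Profiler's API: the hardware names, latencies, and hourly prices are made-up placeholder values, and `Target`, `cost_per_million`, and `cheapest_within_budget` are hypothetical helpers. It assumes one request is served at a time and picks the cheapest target that still meets a latency budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str            # hypothetical hardware label
    latency_ms: float    # measured latency per inference (placeholder value)
    usd_per_hour: float  # on-demand price (placeholder value)


def cost_per_million(t: Target) -> float:
    """USD to serve one million sequential inferences on this target."""
    inferences_per_hour = 3_600_000 / t.latency_ms
    return t.usd_per_hour / inferences_per_hour * 1_000_000


def cheapest_within_budget(targets, latency_budget_ms):
    """Return the lowest-cost target that meets the latency budget, or None."""
    eligible = [t for t in targets if t.latency_ms <= latency_budget_ms]
    return min(eligible, key=cost_per_million, default=None)


# Placeholder measurements; real numbers would come from profiling.
TARGETS = [
    Target("cpu-small", latency_ms=120.0, usd_per_hour=0.10),
    Target("cpu-large", latency_ms=60.0, usd_per_hour=0.40),
    Target("gpu", latency_ms=10.0, usd_per_hour=1.80),
]

if __name__ == "__main__":
    for budget in (200, 100, 5):
        choice = cheapest_within_budget(TARGETS, budget)
        print(budget, "ms budget ->", choice.name if choice else "no target fits")
```

With these placeholder numbers, a looser latency budget lets the small CPU win on cost, while a tighter budget shifts the choice to the GPU, which is the kind of trade-off profiling data is meant to expose.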

Try Profiler yourself by signing up here: https://profiler.app.octoml.ai/
GitHub instructions to get started: https://github.com/octoml/octoml-profile

2 participants
32 minutes

ai

profiling

complexity

tuning

robust

insights

detection

models

leveraging

octoml

2 Nov 2022

OctoML CLI 🔎
https://bit.ly/OctomlYOLO

YOLOv5 Model Deployment Tutorial 🔎
https://github.com/octoml/octoml-cli-tutorials/tree/main/tutorials/yolov5

YOLOv5 Object Detection Model Deployment to Docker Desktop (Full tutorial)

Object detection is a highly popular use case for the well-known YOLO computer vision model. It is widely used in many industries, and its popularity continues to grow. In this video we will deploy the YOLOv5 machine learning model and pass input images and videos into it. At the end, we will get output videos and images with object detection bounding boxes.

This follow-along tutorial is designed to help you quickly get YOLOv5 computer vision models deployed to your local computer for inference. Below you’ll be introduced to the OctoML CLI, a free command line utility that packages machine learning models into deployable Docker containers with NVIDIA Triton Inference Server. When you’re ready to deploy to production, OctoML CLI can also be used to accelerate and deploy YOLOv5 to over 100 instance types in AWS, Azure and GCP.

🔑 TIMESTAMPS
================================
00:00 - Intro
00:55 - Download Docker
03:52 - YOLOv5 Model
07:35 - OctoML CLI
11:26 - Recap
12:17 - Model Inference


1 participant
14 minutes

docker

deploying

yolo

git

downloading

image

helper

app

octo

export

11 Oct 2022

Did you know that over 80% of Machine Learning models never make it into production? But it doesn’t have to be this way!

In her talk, Automated ML Deployment: A Single Stack for Hardware Independence and Maximization of Performance Per Cloud Dollar, Vanessa Yan, Staff Product Manager at OctoML, will teach you how to automate the work away.

For more information on #YOLOVISION22 please visit: https://ultralytics.com/yolo-vision

Ultralytics ⚡ resources
- About Us – https://ultralytics.com/about
- Join Our Team – https://ultralytics.com/work
- Contact Us – https://ultralytics.com/contact

YOLOv5 🚀 resources
- Vision API – https://ultralytics.com/yolov5
- GitHub – https://github.com/ultralytics/yolov5
- Wiki – https://github.com/ultralytics/yolov5...
- Tutorials – https://github.com/ultralytics/yolov5...
- Docs – https://docs.ultralytics.com

1 participant
24 minutes

workloads

gpu

throughput

cpus

optimize

deploying

ai

enterprise

difficulties

aiml

21 Jun 2022

See how you can use the new OctoML CLI tool to develop and deploy intelligent applications faster. Take a trained ML model from popular frameworks like TensorFlow, PyTorch, and ONNX and deploy it into your app in three easy steps.

Try the OctoML CLI and TransparentAI on your own at: https://try.octoml.ai/cli.

0:00 How Developers Use The OctoML CLI

1:40 Local Deployment

3:47 Acceleration

5:10 Cloud Deployment

1 participant
8 minutes

backend

docker

app

tensorflow

kubernetes

ai

microservices

launch

cli

octo

21 Jun 2022

See how IT Operations teams can accelerate and deploy ML models into production, at scale. This demo shows how to use the OctoML CLI to improve the reliability, performance, and cost efficiency of your ML deployments.

Try the OctoML CLI yourself at try.octoml.ai/cli

0:00 How IT Ops Uses OctoML CLI

1:09 Improve Reliability

3:30 Improve Performance

5:22 Improve Cost Efficiency

7:00 Deploy to Kubernetes

1 participant
10 minutes

kubernetes

transparent

workflows

ai

docker

processing

cloud

software

models

octomo

14 Jan 2022

Matthai Philipose from Microsoft speaks about the work his team did in partnership with OctoML on flexible, bulk video analysis (millions of hours of video and billions of images analyzed per month). At this scale, inference is a significant portion of the total compute cost, and the team is working to make inference as efficient as possible. Microsoft ran experiments to optimize key ML models with TVM, varying input size, batching, and processor targets, and compared the inference throughput against the production baseline. The results typically showed 1.2x to 3x higher throughput when optimizing with TVM. Microsoft is now planning to move TVM-optimized models into production.
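A throughput comparison of this kind can be sketched in a few lines. This is an illustrative harness only, not the one used in the talk: `baseline_predict` is a stand-in function rather than a real model, and `measure_throughput`, `sweep`, and `speedup` are hypothetical helper names. It sweeps batch sizes, times repeated calls, and reports items per second so that a baseline and an optimized build can be compared as a ratio.

```python
import time


def measure_throughput(predict, batch, repeats=20):
    """Return items processed per second for `predict` on `batch`."""
    predict(batch)  # warm-up call, excluded from timing
    start = time.perf_counter()
    for _ in range(repeats):
        predict(batch)
    elapsed = time.perf_counter() - start
    return repeats * len(batch) / elapsed


def sweep(predict, batch_sizes, item_len=256):
    """Measure throughput for each batch size; returns {batch_size: items/s}."""
    results = {}
    for n in batch_sizes:
        batch = [[float(i) for i in range(item_len)] for _ in range(n)]
        results[n] = measure_throughput(predict, batch)
    return results


def speedup(baseline_tput, candidate_tput):
    """Throughput ratio of a candidate build over the baseline."""
    return candidate_tput / baseline_tput


def baseline_predict(batch):
    """Stand-in for a model: some arithmetic per input item."""
    return [sum(x * x for x in item) for item in batch]


if __name__ == "__main__":
    for size, tput in sweep(baseline_predict, [1, 8, 32]).items():
        print(f"batch={size:>2}  {tput:,.0f} items/s")
```

In a real experiment the same sweep would be run once with the production model and once with the optimized one, on each processor target, and `speedup` applied to each pair of results.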

- - -
Recorded at TVMCon (https://www.tvmcon.org), the Machine Learning Acceleration Conference in Dec 2021.

TVMCon covers the state of the art of deep learning compilation and optimization, with a range of tutorials, research talks, case studies, and industry presentations. We discuss recent advances in ML frameworks, compilers, systems and architecture support, security, training and hardware acceleration.

Connect with us for the latest in ML Acceleration and Deployment:
Website: https://octoml.ai
LinkedIn: https://www.linkedin.com/company/octoml
Twitter: https://twitter.com/OctoML

1 participant
6 minutes

efficient

optimizing

throughput

tvm

analyze

functionality

bulk

processor

watch

gpu

10 Jan 2022

Day 2 Keynote from TVMCon (Dec 2021), the Apache TVM and Open Source ML Acceleration Conference.
SPEAKERS: Luis Ceze, Jason Knight, Vanessa Yan, Sameer Farooqui, Matthai Philipose
--
Learn about OctoML, a startup headquartered in Seattle, WA that is focused on making artificial intelligence faster and easier to deploy.

The product demo features the OctoML Model Zoo: pre-accelerated, extremely fast (sub-millisecond), ready-to-download computer vision and language models for both cloud and edge targets.

About OctoML:
OctoML was spun out of the University of Washington Paul G. Allen School of Computer Science & Engineering, where a group of computer science experts worked on helping companies deploy machine learning models on varied hardware configurations. That work led to the creation of the open-source deep learning compiler Apache TVM, which has quickly become the de facto deep learning compiler used by companies like Amazon and Facebook.

++ Contents ++
00:00 - OctoML Company and Product Vision - by Luis Ceze
14:55 - OctoML Product Vision Intro - by Jason Knight
16:27 - OctoML Product Demo - by Vanessa Yan, Sameer Farooqui
24:17 - OctoML Product Vision Continued - by Jason Knight
35:17 - TVM-Optimized Video Analysis in Microsoft Watch For - by Matthai Philipose (Microsoft)
41:13 - Future of the OctoML Platform - by Jason Knight

5 participants
45 minutes

ai

cvm

tvm

vm

automation

advance

ml

technology

aiml

manage
