FSDL Lecture 6: Deployment is now live! This lecture covers a critical step: getting your model into prod. The key message is similar to our philosophy in other parts of the ML workflow: Start simple, add complexity as you need it. fullstackdeeplearning.com/course/2022/lecture-5-deployment/


When it's time to deploy, the first step is to create a prototype you and your friends / teammates can interact with. @Gradio, @huggingface, and @streamlit are your friends at this stage. You do want this to have a basic UI and be hosted behind a webserver to reduce friction.
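
Here's a minimal sketch of what that prototype can look like with Gradio. The predict function is a placeholder for your own model code; the Gradio calls themselves are the real API, and launch(share=True) gives teammates a temporary public URL with no infra to stand up.

```python
# Minimal Gradio prototype sketch. The predict function is a placeholder --
# swap in your real model's forward pass. Gradio handles the UI + web server.
import gradio as gr

def predict(text: str) -> str:
    # Replace this echo with a real forward pass through your model.
    return f"(placeholder) prediction for: {text}"

demo = gr.Interface(fn=predict, inputs="text", outputs="text", title="My model demo")

if __name__ == "__main__":
    # share=True creates a temporary public link for teammates to try out.
    demo.launch(share=True)
```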


This is an example of the model-in-service deployment paradigm, where you just embed your model in your webserver. It's simple to implement, but will run into issues as you scale because models and web servers scale differently.
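
For concreteness, here's a minimal sketch of model-in-service using FastAPI (any web framework works the same way). DummyModel stands in for your real model; the point is that the model object lives inside the web server process, so it scales and fails together with the web tier.

```python
# Model-in-service sketch: the model is loaded inside the web server process.
# FastAPI/pydantic calls are real; DummyModel is a stand-in for your framework.
from fastapi import FastAPI
from pydantic import BaseModel

class DummyModel:
    def predict(self, text: str) -> str:
        return text.upper()  # placeholder for a real forward pass

class PredictRequest(BaseModel):
    text: str

app = FastAPI()
model = DummyModel()  # loaded once at import time, shared across requests

@app.post("/predict")
def predict(req: PredictRequest):
    # Model and web server share the same CPU/RAM -- the scaling issue above.
    return {"prediction": model.predict(req.text)}

# Run with: uvicorn app:app
```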


Once you run into some of these limitations, it's time to pull the model out of your web server. At a high level, there are two ways to do this. The first is batch prediction, where you run your model periodically on all possible inputs, and store the results in a database.
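
A hedged sketch of what such a batch job can look like, using sqlite and made-up table names as stand-ins for your own warehouse and schema; in practice you'd run it on a schedule with cron, Airflow, or similar.

```python
# Batch prediction sketch: on a schedule, score every known input and store the
# results so the app only does a cheap lookup at request time. sqlite and the
# table/column names are stand-ins for your own data warehouse and schema.
import sqlite3

def dummy_score(user_id: int) -> float:
    return (user_id * 37 % 100) / 100.0  # placeholder for a real model

def run_batch_job(db_path: str = "app.db") -> None:
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE IF NOT EXISTS predictions (user_id INTEGER PRIMARY KEY, score REAL)")
    user_ids = [row[0] for row in conn.execute("SELECT id FROM users")]
    rows = [(uid, dummy_score(uid)) for uid in user_ids]  # "all possible inputs"
    conn.executemany("INSERT OR REPLACE INTO predictions VALUES (?, ?)", rows)
    conn.commit()
    conn.close()

if __name__ == "__main__":
    run_batch_job()
    # At request time the web server just reads the precomputed row:
    #   SELECT score FROM predictions WHERE user_id = ?
```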


Batch prediction is simple to implement, scales really well, has low latency, and has been used in production for years by top companies in large-scale systems. However:
- It doesn't work if you have a large universe of possible inputs, as in many use cases
- Predictions quickly get stale


The second way to pull the model out of the web server is to run it as a separate service. This is the right answer for most ML use cases. It lets you scale & manage it separately and reuse it across apps. The tradeoff is added latency & infra complexity.
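
From the web app's side, the change is just swapping an in-process call for a network call. Here's a sketch with a made-up internal URL and payload shape; the model service itself can be any of the serving options below.

```python
# Model-as-a-service sketch, from the calling app's side: the model runs behind
# its own endpoint and the app makes an HTTP call. The URL, payload shape, and
# timeout are illustrative, not a real service.
import requests

MODEL_SERVICE_URL = "http://model-service.internal:8000/predict"  # hypothetical

def get_prediction(text: str) -> str:
    # This network hop is the latency/infra tradeoff mentioned above, but it
    # lets the model service scale and deploy independently of the web app.
    resp = requests.post(MODEL_SERVICE_URL, json={"text": text}, timeout=2.0)
    resp.raise_for_status()
    return resp.json()["prediction"]
```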


In the lecture, we cover some of the main problems you may need to solve in the course of building and scaling your model service:
- Managing dependencies
- Optimizing the model's performance (GPUs?)
- Scaling the service horizontally
- Rolling out new versions


Since we're MLEs, not infra engineers, it rarely makes sense to solve all of these problems ourselves. Serverless options (like AWS Lambda) handle scaling out-of-the-box. They're well suited to a wide range of ML applications and are our default recommendation.
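
As an illustration, here's roughly what a Lambda-style entry point looks like. The handler(event, context) signature is Lambda's convention; the "model" here is a stand-in for deserializing your real weights.

```python
# Serverless sketch (AWS Lambda style). handler(event, context) is the real
# Lambda convention; the model is a placeholder. Loading at module scope means
# it's cached across warm invocations and only re-paid on cold starts.
import json

def _load_model():
    return lambda text: text[::-1]  # stand-in for loading real weights

MODEL = _load_model()  # module scope -> reused between invocations

def handler(event, context):
    body = json.loads(event.get("body", "{}"))
    prediction = MODEL(body.get("text", ""))
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```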


If you want a simpler developer experience or more deployment / scaling features out of the box, there are managed options. SageMaker is a good first thing to try if you're on AWS. There's also a range of startups, some of which provide more features like serverless GPUs.
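
If you go the SageMaker route, standing up an endpoint can be as short as the sketch below (SageMaker Python SDK, PyTorch flavor). The S3 path, IAM role, entry point script, and versions are placeholders to adapt to your setup.

```python
# Managed-endpoint sketch with the SageMaker Python SDK (PyTorch flavor).
# The S3 artifact path, IAM role, and versions are placeholders; check the SDK
# docs for the exact arguments your framework/version needs.
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    model_data="s3://my-bucket/model.tar.gz",              # hypothetical artifact
    role="arn:aws:iam::123456789012:role/SageMakerRole",   # hypothetical role
    entry_point="inference.py",                            # your inference script
    framework_version="1.12",
    py_version="py38",
)

# Provisions a managed, autoscalable HTTPS endpoint behind the scenes.
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.large")
print(predictor.predict({"text": "hello"}))
```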


The last deployment paradigm we cover is deploying to the edge. Sometimes edge is your only option, like if you're deploying to a device with no internet. Edge also minimizes latency, is great for security, and scales well because users bring their own compute.


However, edge deployment is still an immature part of the stack, and it comes with pretty significant tradeoffs:
- Edge devices have limited resources
- Edge frameworks are immature
- It's difficult to update models
- It's difficult to get data back for debugging or retraining


If you do deploy to the edge, we recommend the following mindsets:
- Choose architecture with target hardware in mind
- Iterate locally, but don't make big changes without verifying on-device
- Test models on production hardware
- Build in fallbacks for failures / latency
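
As a concrete first step toward "choose architecture with target hardware in mind", you usually export the trained model to a portable format that an edge runtime can load. A toy ONNX export sketch (the model and shapes are made up; real projects also apply quantization/pruning and verify on the actual device):

```python
# Edge-prep sketch: export a toy PyTorch model to ONNX so an edge runtime
# (ONNX Runtime, TensorRT, etc.) can load it. The architecture and input shape
# are illustrative -- match them to your model and target device.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2)).eval()
example_input = torch.randn(1, 16)  # shape must match what the device will send

torch.onnx.export(
    model,
    example_input,
    "model.onnx",
    input_names=["features"],
    output_names=["logits"],
)
```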


To summarize, you only see if your model actually works after you deploy it, so deploy early, and deploy often! Check out the lecture if you want to learn more about deploying models to production. fullstackdeeplearning.com/course/2022/lecture-5-deployment/

