How to Host Label Studio on AWS Elastic Beanstalk with RDS PostgreSQL

2022-11-03

How to Host Label Studio on AWS Elastic Beanstalk with Persistent Storage and RDS PostgreSQL

By Hyuntaek Park

Senior full-stack engineer at Twigfarm

At Twigfarm, we wanted to host Label Studio on AWS Elastic Beanstalk. I searched for instructions and found only one article: https://medium.com/@KerleIndia/how-to-host-label-studio-on-aws-elastic-beanstalk-41abcbcced4

It describes a simple way to host Label Studio on AWS Elastic Beanstalk, but the result has data persistence problems. The storage is only temporary, so login sessions and uploaded files are gone within a few hours.

An alternative is Heroku, where Label Studio can be deployed with just a few clicks. However, we wanted to control the database and files directly, so we did not want to give up on the AWS Elastic Beanstalk option.

In this tutorial, I go over how to deploy Label Studio on AWS Elastic Beanstalk with persistent storage and a separate database. Custom domains and HTTPS settings are beyond the scope of this tutorial. You can refer to this link for some hints: https://io.twigfarm.net/aws/https-ec2/

Architecture

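Here is a simple overview of how Label Studio is deployed on AWS Elastic Beanstalk: the application runs as a Docker container whose image is pulled from ECR, stores its files on EFS, and keeps its database in RDS PostgreSQL.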

Note that I am using the smallest possible instances here for demo purposes. For a production deployment, you will likely need larger instances and a properly configured load balancer.

Let us start by setting up RDS PostgreSQL.

AWS RDS Postgres Setup

Go to RDS and create a PostgreSQL database. Leave the settings at their defaults except for the following:

Choose PostgreSQL
Set username
Set the master password
Confirm password
Initial database name in Additional configuration (Important!): I used “labelstudio_db”

Then wait until the database has been created.

Docker Build

Go ahead and clone the Label Studio GitHub Repository.

git clone https://github.com/heartexlabs/label-studio.git label-studio-eb

ECR Setup

With the Label Studio source cloned, it is time to set up a repository for the Docker image.

Go to “Amazon Elastic Container Service (ECS)”, then choose Repositories under Amazon ECR in the left panel of the AWS console.

Click “Create repository”.

Repository name: labelstudio

Click “Create repository”.

Once the repository is created, click “View push commands” and run the commands it shows, in this order:

Login
Docker build
Tag
Push
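
The exact commands come from the “View push commands” dialog. As a rough sketch of what the four steps usually look like, assuming a placeholder account ID and region (both hypothetical, replace them with your own) and that the script is run from inside the cloned label-studio-eb directory:

```shell
set -eu

# Placeholders (assumptions): replace with your own AWS account ID and region.
AWS_ACCOUNT_ID="${AWS_ACCOUNT_ID:-123456789012}"
AWS_REGION="${AWS_REGION:-us-east-1}"
REPO_NAME="labelstudio"   # repository name created above

REGISTRY="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com"
IMAGE_URI="${REGISTRY}/${REPO_NAME}:latest"

push_image() {
  # 1. Login: hand a temporary ECR password to docker login.
  aws ecr get-login-password --region "$AWS_REGION" \
    | docker login --username AWS --password-stdin "$REGISTRY"
  # 2. Docker build: uses the Dockerfile in the current directory.
  docker build -t "$REPO_NAME" .
  # 3. Tag: point the local image at the ECR repository.
  docker tag "${REPO_NAME}:latest" "$IMAGE_URI"
  # 4. Push
  docker push "$IMAGE_URI"
}

# The push only happens when explicitly requested, e.g. PUSH=1 sh push.sh
if [ "${PUSH:-0}" = "1" ]; then
  push_image
fi

echo "Image URI: $IMAGE_URI"
```

Running it without PUSH=1 only prints the image URI, which is the value needed later in Dockerrun.aws.json.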

After a successful push, the image should appear in the repository.

IAM role to connect to ECR from the Elastic Beanstalk

Go to “IAM” in the AWS console. Choose “aws-elasticbeanstalk-ec2-role” under Access management > Roles and attach the “AWSAppRunnerServicePolicyForECRAccess” policy to it.

EFS

Go to EFS (Elastic File System) in the AWS console and create a file system by clicking the “Create file system” button. Leave all the settings at their defaults; this creates a file system with the “Standard” storage class.

Make a note of the “File system ID”, which we will need in the next section.

Elastic Beanstalk Deployment Files

We need to create four files:

.ebextensions/01_mount.config # For EFS mount
.platform/00_nginx.config # For nginx settings
.platform/nginx/conf.d/proxy.conf # For unlimited file size upload
Dockerrun.aws.json # For the Docker image and container settings

EFS mount

Persistent storage is very important because we upload source files and create output files that must be retained over time and shared across users. I chose EFS because it is persistent and scales automatically.

First of all, create the .ebextensions directory and the 01_mount.config file inside it.
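
The original file contents are not reproduced here, so the following is only a minimal sketch of what such a mount configuration could look like. The file system ID is a placeholder and the /efs mount point is a hypothetical choice; it assumes the amazon-efs-utils package is available through yum on the Amazon Linux 2 platform.

```shell
set -eu

# Placeholder (assumption): replace with the "File system ID" noted in the EFS section.
EFS_ID="${EFS_ID:-fs-0123456789abcdef0}"
# Hypothetical mount point on the instance; any empty directory works.
MOUNT_DIR="/efs"

mkdir -p .ebextensions

# Unquoted heredoc so that ${EFS_ID} and ${MOUNT_DIR} are substituted.
cat > .ebextensions/01_mount.config <<EOF
packages:
  yum:
    amazon-efs-utils: []

commands:
  01_create_mount_dir:
    command: "mkdir -p ${MOUNT_DIR}"
  02_mount_efs:
    # Skip the mount when the directory is already a mount point (for example on redeploy).
    command: "mountpoint -q ${MOUNT_DIR} || mount -t efs ${EFS_ID}:/ ${MOUNT_DIR}"
EOF

echo "wrote .ebextensions/01_mount.config"
```

Depending on which user the container runs as, the mounted directory may also need its ownership or permissions adjusted before Label Studio can write to it.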

nginx configuration

In Label Studio we often deal with files larger than 10 MB. We need to override the default nginx settings manually so that large files can be uploaded. Two files are needed to achieve this.

You can consult more details about Elastic Beanstalk platform extensions in this link: https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html
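
As a hedged sketch, the upload limit itself can be lifted with a single nginx directive in proxy.conf. Only that file is shown here; the contents of .platform/00_nginx.config are not given in this article, so they are left out rather than guessed.

```shell
set -eu

mkdir -p .platform/nginx/conf.d

# Quoted heredoc: the content is written exactly as shown.
cat > .platform/nginx/conf.d/proxy.conf <<'EOF'
# 0 disables nginx's check of the request body size, so uploads are not capped.
client_max_body_size 0;
EOF

echo "wrote .platform/nginx/conf.d/proxy.conf"
```

If unlimited uploads are too permissive for your setup, a concrete value such as 500M can be used instead of 0.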

Dockerrun.aws.json File
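
The contents of this file are not reproduced here, so this is a sketch under stated assumptions rather than the exact file used: a version 1 Dockerrun.aws.json that points at the image in ECR, exposes port 8080 (Label Studio's default port), and maps a host directory to /label-studio/data in the container. The image URI is a placeholder, and /efs is the hypothetical EFS mount point on the instance.

```shell
set -eu

# Placeholder (assumption): the URI of the image pushed to ECR.
IMAGE_URI="${IMAGE_URI:-123456789012.dkr.ecr.us-east-1.amazonaws.com/labelstudio:latest}"

# Unquoted heredoc so that ${IMAGE_URI} is substituted.
cat > Dockerrun.aws.json <<EOF
{
  "AWSEBDockerrunVersion": "1",
  "Image": {
    "Name": "${IMAGE_URI}",
    "Update": "true"
  },
  "Ports": [
    { "ContainerPort": 8080 }
  ],
  "Volumes": [
    {
      "HostDirectory": "/efs",
      "ContainerDirectory": "/label-studio/data"
    }
  ]
}
EOF

echo "wrote Dockerrun.aws.json"
```

The volume mapping is what makes the files persistent: whatever the container writes to its data directory lands on the EFS mount instead of the instance's temporary disk.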

Upload

Now that our Elastic Beanstalk configuration files are ready, let us zip them into one file: deploy.zip
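
One way to build the archive is sketched below. It assumes the files sit in the current directory and that the zip command is installed; the check at the top is only there because the hidden directories are easy to miss.

```shell
set -eu

# Everything Elastic Beanstalk needs, relative to the current directory.
FILES=".ebextensions .platform Dockerrun.aws.json"

missing=0
for f in $FILES; do
  if [ ! -e "$f" ]; then
    echo "missing: $f" >&2
    missing=1
  fi
done

if [ "$missing" -eq 0 ]; then
  # -r recurses into the hidden directories.
  zip -r deploy.zip $FILES
else
  echo "deploy.zip not created; add the missing files first" >&2
fi
```

The files must sit at the top level of the archive, not inside a parent folder, so run the command from the directory that contains them.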

Almost done. Let us create an Elastic Beanstalk application to deploy the system.

Elastic Beanstalk Creation

Go to Amazon Elastic Beanstalk Environments in the AWS console. Start by clicking “Create a new environment”.

Environment tier: Web server environment
Application name: labelstudio-eb
Platform: Docker
Platform branch: Docker running on 64bit Amazon Linux 2
Platform version: 3.4.17 (Recommended)
Application code: Upload your code
Source code origin: Local file
Choose file: deploy.zip

We are not done yet. Click the “Configure more options” button to set the environment variables for the database. Find the “Software” section and click the “Edit” button, then scroll down to “Environment properties” to enter the environment variables.

Networking Setup

Choose the same VPC as the one that the EFS file system is in.

Instance Security Group Setup

Choose the default security group, since it allows traffic between resources that share it, which lets the instance reach the EFS file system when both use that group. More specifically, the instance needs a security group that can reach EFS on port 2049 (NFS).

Environment Variable Set

POSTGRE_USER: <DB_USERNAME>
POSTGRE_PASSWORD: <DB_PASSWORD>
POSTGRE_HOST: <HOST_URL> # Find it under RDS > Connectivity & security > Endpoint
POSTGRE_PORT: 5432 # Find it under RDS > Connectivity & security > Port
POSTGRE_NAME: labelstudio_db
DJANGO_DB: default

Click the “Save” button and then the “Create environment” button. This will take a few minutes.

Now the environment is ready. Click the “Go to environment” button to see if the application works well.
