From ca31db97a467bfabb6050d4ef41ec63e0552499d Mon Sep 17 00:00:00 2001 From: xichen Date: Thu, 26 Jan 2017 14:56:22 -0800 Subject: [PATCH 1/2] added region check step, route53 config. --- doc/howto/usage/k8s/k8s_aws_en.md | 39 ++++++++++++++++++++++++++----- 1 file changed, 33 insertions(+), 6 deletions(-) diff --git a/doc/howto/usage/k8s/k8s_aws_en.md b/doc/howto/usage/k8s/k8s_aws_en.md index a6422b9be0..227b6dee86 100644 --- a/doc/howto/usage/k8s/k8s_aws_en.md +++ b/doc/howto/usage/k8s/k8s_aws_en.md @@ -1,4 +1,9 @@ # Kubernetes on AWS +## Choose AWS Service Region +This tutorial requires several AWS services work in the same region. Before we create anything in AWS, please check the following link +https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/ +Choose a region which has the following services available: EC2, EFS, VPS, CloudFormation, KMS, VPC, S3. +In this tutorial, we use "Oregon(us-west-2)" as example. ## Create AWS Account and IAM Account @@ -35,7 +40,8 @@ Here we will show you step by step on how to run PaddlePaddle training on AWS cl #### kube-aws [kube-aws](https://github.com/coreos/kube-aws) is a CLI tool to automate cluster deployment to AWS. - +##### Verify integrity of kube-aws downloaded +Note: if you are using a non-official release (e.g RC release) kube-aws, you can skip this setp. Import the CoreOS Application Signing Public Key: ``` @@ -60,7 +66,7 @@ PLATFORM=darwin-amd64 gpg2 --verify kube-aws-${PLATFORM}.tar.gz.sig kube-aws-${PLATFORM}.tar.gz ``` - +##### Install kube-aws Extract the binary: ``` @@ -201,14 +207,15 @@ paste into following inline policies: ] } ``` - +`Version` : Its value has to be exactly "2012-10-17". `AWS_ACCOUNT_ID`: You can get it from following command line: ``` aws sts get-caller-identity --output text --query Account ``` -`MY_CLUSTER_NAME`: Pick a MY_CLUSTER_NAME that you like, you will use it later as well. 
+`MY_CLUSTER_NAME`: Pick a MY_CLUSTER_NAME that you like, you will use it later as well. +Please note, the stack name must satisfy the regular expression pattern [a-zA-Z][-a-zA-Z0-9]*, which means it must start with a letter and may contain only letters, digits, and "-" (no "_"), or kube-aws will throw an error in later steps. #### External DNS name @@ -216,7 +223,7 @@ When the cluster is created, the controller will expose the TLS-secured API on a The A record of that DNS name needs to be point to the cluster ip address. -We will need to use DNS name later in tutorial. If you don't already own one, you can choose any DNS name (e.g., `paddle`) and modify `/etc/hosts` to associate cluster ip with that DNS name. +We will need to use DNS name later in tutorial. If you don't already own one, you can choose any DNS name (e.g., `paddle`) and modify `/etc/hosts` to associate cluster ip with that DNS name. We will find the ip to map to `paddle` in later steps. Also in this case, will have to add name service (route53) in aws cluster in later step. #### S3 bucket @@ -346,7 +353,27 @@ paddle-cl-ElbAPISe-EEOI3EZPR86C-531251350.us-west-1.elb.amazonaws.com. 59 IN A 5 In the above output, both ip `54.241.164.52`, `54.67.102.112` will work. -If you own a DNS name, set the A record to any of the above ip. Otherwise you can edit `/etc/hosts` to associate ip with the DNS name. +*If you own a DNS name*, set the A record to any of the above IPs. Then you can skip to the step "Access the cluster". + +*If you do not own a DNS name*: +##### Update local DNS association +Edit `/etc/hosts` to associate above ip with the DNS name. 
+##### Add Route53 private name service in VPC + - Open Route53 web console + - Create hosted zone with following config + - Domain name: "paddle" + - Type: "Private hosted zone for amazon VPC" + - VPC ID: + - Add A record + - Click on the zone "paddle" just created + - Click the button "Create record set" + - Name: leave blank + - Type: "A" + - Value: + - Verify name service + - Connect to any instance created by kube-aws via ssh + - Run command "host paddle" and check that the IP returned is the private IP of kube-controller + #### Access the cluster From 3f02edeeeae64943cc652a865c600eec384e645f Mon Sep 17 00:00:00 2001 From: xichen Date: Fri, 27 Jan 2017 11:40:11 -0800 Subject: [PATCH 2/2] wording update after pr review --- doc/howto/usage/k8s/k8s_aws_en.md | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/doc/howto/usage/k8s/k8s_aws_en.md b/doc/howto/usage/k8s/k8s_aws_en.md index 46c4f5582d..ce5ccbca5e 100644 --- a/doc/howto/usage/k8s/k8s_aws_en.md +++ b/doc/howto/usage/k8s/k8s_aws_en.md @@ -3,12 +3,6 @@ We will show you step by step on how to run distributed PaddlePaddle training on AWS cluster with Kubernetes. Let's start from core concepts. -## Choose AWS Service Region -This tutorial requires several AWS services work in the same region. Before we create anything in AWS, please check the following link -https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/ -Choose a region which has the following services available: EC2, EFS, VPS, CloudFormation, KMS, VPC, S3. -In this tutorial, we use "Oregon(us-west-2)" as example. - ## Distributed PaddlePaddle Training Core Concepts ### Distributed Training Job @@ -50,6 +44,12 @@ We rank each pod by sorting them by their ips. The rank of each pod could be the ## PaddlePaddle on AWS with Kubernetes +### Choose AWS Service Region +This tutorial requires several AWS services to work in the same region. 
Before we create anything in AWS, please check the following link: +https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/ +Choose a region which has the following services available: EC2, EFS, CloudFormation, KMS, VPC, S3. +In this tutorial, we use "Oregon (us-west-2)" as an example. + ### Create AWS Account and IAM Account Under each AWS account, we can create multiple [IAM](http://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html) users. This allows us to grant some privileges to each IAM user and to create/operate AWS clusters as an IAM user. @@ -80,7 +80,7 @@ Please be aware that this tutorial needs the following privileges for the user i #### kube-aws [kube-aws](https://github.com/coreos/kube-aws) is a CLI tool to automate cluster deployment to AWS. -##### Verify integrity of kube-aws downloaded +##### Verify kube-aws integrity Note: if you are using a non-official release (e.g RC release) kube-aws, you can skip this setp. Import the CoreOS Application Signing Public Key: @@ -265,10 +265,7 @@ When the cluster is created, the controller will expose the TLS-secured API on a DNS name should have a CNAME points to cluster DNS name or an A record points to the cluster IP address. - -We will need to use DNS name later in tutorial. If you don't already own one, you can choose any DNS name (e.g., `paddle`) and modify `/etc/hosts` to associate cluster ip with that DNS name. We will find the ip to map to `paddle` in later steps. Also in this case, will have to add name service (route53) in aws cluster in later step. -======= -We will need to use DNS name later in tutorial. +We will need to use a DNS name later in the tutorial. If you don't already own one, you can choose any DNS name (e.g., `paddle`) and modify `/etc/hosts` on your local machine to associate the cluster IP with that DNS name. For the cluster itself, add a name service (Route53) in AWS to associate the IP with `paddle`. We will find the cluster IP in later steps. 
#### S3 bucket @@ -382,7 +379,7 @@ In the above output, both ip `54.241.164.52`, `54.67.102.112` will work. ##### Update local DNS association Edit `/etc/hosts` to associate above ip with the DNS name. ##### Add Route53 private name service in VPC - - Open Route53 web console + - Open [Route53 Console](https://console.aws.amazon.com/route53/home) - Create hosted zone with following config - Domain name: "paddle" - Type: "Private hosted zone for amazon VPC"
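The "Update local DNS association" step (editing `/etc/hosts`) could be sketched as a small shell function. This is only an illustration under stated assumptions: `add_hosts_entry` is a hypothetical helper that is not part of kube-aws or of the tutorial, and the IP `54.241.164.52` and the name `paddle` are simply the example values quoted in the hunk context above.

```shell
# Hypothetical helper (an assumption for illustration, not part of kube-aws):
# append "IP NAME" to a hosts file unless NAME is already mapped there.
#   $1 = IP address, $2 = DNS name, $3 = hosts file (defaults to /etc/hosts)
# NAME is used inside a regular expression, so a dot in NAME matches any character.
add_hosts_entry() {
  hosts_ip="$1"
  hosts_name="$2"
  hosts_file="${3:-/etc/hosts}"
  # Look for NAME as a whole field on any line, ignoring text after a "#".
  if grep -Eq "^[^#]*[[:space:]]${hosts_name}([[:space:]]|\$)" "$hosts_file"; then
    echo "${hosts_name} is already present in ${hosts_file}"
  else
    printf '%s %s\n' "$hosts_ip" "$hosts_name" >> "$hosts_file"
  fi
}
```

Calling `add_hosts_entry 54.241.164.52 paddle` with root privileges would append the mapping to `/etc/hosts`; passing a scratch file as the third argument lets you try it without touching the real file.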