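Sessions dropping when connecting to a DB through an NLB
This was a somewhat unusual setup: I was told that when a database was reached through an AWS NLB, sessions were occasionally dropped.
I heard the team concluded that the problem came from going through the NLB and worked around it by connecting to the DB directly,
so I tried to reproduce the issue on my own to find out why it happened.
Issue summary
When an AWS Network Load Balancer relayed connections to a redundant pair of DBs, sessions were occasionally dropped.
Issue reproduction test
I did not test with RDS or with fully synchronized DBs. Instead, I installed the same PostgreSQL version on two arbitrary EC2 instances,
built the setup below with CloudFormation, and tested against it.
CloudFormation template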
# Select the key pair used by the EC2 instances
Parameters:
  KeyName:
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instances. Linked to AWS Parameter
    Type: AWS::EC2::KeyPair::KeyName
    ConstraintDescription: must be the name of an existing EC2 KeyPair.
# AWS resources for the PostgreSQL issue test
Resources:
  # VPC
  DevPostgreSQLIssueTestVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.10.0.0/16
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestVPC
  # Public Subnet A
  DevPostgreSQLIssueTestPublicSubnetA:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref DevPostgreSQLIssueTestVPC
      AvailabilityZone: !Select [ 0, !GetAZs '' ]
      CidrBlock: 10.10.0.0/24
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestPublicSubnetA
  # Public Subnet C
  DevPostgreSQLIssueTestPublicSubnetC:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref DevPostgreSQLIssueTestVPC
      AvailabilityZone: !Select [ 2, !GetAZs '' ]
      CidrBlock: 10.10.1.0/24
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestPublicSubnetC
  # Private Subnet A
  DevPostgreSQLIssueTestPrivateSubnetA:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref DevPostgreSQLIssueTestVPC
      AvailabilityZone: !Select [ 0, !GetAZs '' ]
      CidrBlock: 10.10.10.0/24
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestPrivateSubnetA
  # Private Subnet C
  DevPostgreSQLIssueTestPrivateSubnetC:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref DevPostgreSQLIssueTestVPC
      AvailabilityZone: !Select [ 2, !GetAZs '' ]
      CidrBlock: 10.10.11.0/24
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestPrivateSubnetC
  # Internet Gateway
  DevPostgreSQLIssueTestIGW:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestIGW
  # Attach the Internet Gateway to the VPC
  DevPostgreSQLIssueTestIGWAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      InternetGatewayId: !Ref DevPostgreSQLIssueTestIGW
      VpcId: !Ref DevPostgreSQLIssueTestVPC
  # EIP for the NAT Gateway
  DevPostgreSQLIssueTestNATEIP:
    Type: AWS::EC2::EIP
  # Create the NAT Gateway and associate it with the EIP and subnet
  DevPostgreSQLIssueTestNATGateway:
    Type: AWS::EC2::NatGateway
    DependsOn:
      - DevPostgreSQLIssueTestIGWAttachment
      - DevPostgreSQLIssueTestPublicSubnetA
      - DevPostgreSQLIssueTestNATEIP
    Properties:
      AllocationId: !GetAtt DevPostgreSQLIssueTestNATEIP.AllocationId
      SubnetId: !Ref DevPostgreSQLIssueTestPublicSubnetA
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestNATGateway
  # Route table for the public subnets
  DevPostgreSQLIssueTestPublicRT:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref DevPostgreSQLIssueTestVPC
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestPublicRT
  # Default route of the public route table (via the Internet Gateway)
  DevPostgreSQLIssueTestPublicDefaultRoute:
    Type: AWS::EC2::Route
    DependsOn: DevPostgreSQLIssueTestIGWAttachment
    Properties:
      RouteTableId: !Ref DevPostgreSQLIssueTestPublicRT
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref DevPostgreSQLIssueTestIGW
  # Associate the public subnets with the public route table
  DevPostgreSQLIssueTestPublicSubnetRouteTableAssociationA:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref DevPostgreSQLIssueTestPublicRT
      SubnetId: !Ref DevPostgreSQLIssueTestPublicSubnetA
  DevPostgreSQLIssueTestPublicSubnetRouteTableAssociationC:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref DevPostgreSQLIssueTestPublicRT
      SubnetId: !Ref DevPostgreSQLIssueTestPublicSubnetC
  # Route table for the private subnets
  DevPostgreSQLIssueTestPrivateRT:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref DevPostgreSQLIssueTestVPC
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestPrivateRT
  # Default route of the private route table (via the NAT Gateway)
  DevPostgreSQLIssueTestPrivateDefaultRoute:
    Type: AWS::EC2::Route
    DependsOn: DevPostgreSQLIssueTestIGWAttachment
    Properties:
      RouteTableId: !Ref DevPostgreSQLIssueTestPrivateRT
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref DevPostgreSQLIssueTestNATGateway
  # Associate the private subnets with the private route table
  DevPostgreSQLIssueTestPrivateSubnetRouteTableAssociationA:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref DevPostgreSQLIssueTestPrivateRT
      SubnetId: !Ref DevPostgreSQLIssueTestPrivateSubnetA
  DevPostgreSQLIssueTestPrivateSubnetRouteTableAssociationC:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref DevPostgreSQLIssueTestPrivateRT
      SubnetId: !Ref DevPostgreSQLIssueTestPrivateSubnetC
  # Security group applied to the EC2 instances
  DevPostgreSQLIssueTestEC2SG:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Enable HTTP access via port 80 and SSH access via port 22
      VpcId: !Ref DevPostgreSQLIssueTestVPC
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestEC2SG
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: '80'
          ToPort: '80'
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: '5432'
          ToPort: '5432'
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: '22'
          ToPort: '22'
          CidrIp: 0.0.0.0/0
        - IpProtocol: icmp
          FromPort: -1
          ToPort: -1
          CidrIp: 0.0.0.0/0
  # PostgreSQL EC2 instances A and C
  # Change the root password and enable password login so the instances are easy to reach from the Bastion
  # Install PostgreSQL and configure it to accept external connections
  # Create one instance in zone A and one in zone C for NLB load balancing
  DevPostgreSQLIssueTestEC2A:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      ImageId: ami-03b42693dc6a7dc35
      KeyName: !Ref KeyName
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestEC2A
      NetworkInterfaces:
        - DeviceIndex: 0
          SubnetId: !Ref DevPostgreSQLIssueTestPrivateSubnetA
          GroupSet:
            - !Ref DevPostgreSQLIssueTestEC2SG
          AssociatePublicIpAddress: false
      UserData:
        Fn::Base64:
          !Sub |
            #!/bin/bash
            hostname DevPostgreSQLIssueTestEC2A
            # Set the root password and allow password-based root login over SSH
            (
            echo "test1234%"
            echo "test1234%"
            ) | passwd --stdin root
            sed -i "s/^PasswordAuthentication no/PasswordAuthentication yes/g" /etc/ssh/sshd_config
            sed -i "s/^#PermitRootLogin yes/PermitRootLogin yes/g" /etc/ssh/sshd_config
            service sshd restart
            # Install PostgreSQL 13 from the PGDG repository
            amazon-linux-extras install epel -y
            tee /etc/yum.repos.d/pgdg.repo<<"EOF"
            [pgdg13]
            name=PostgreSQL 13 for RHEL/CentOS 7 - x86_64
            baseurl=http://download.postgresql.org/pub/repos/yum/13/redhat/rhel-7-x86_64
            enabled=1
            gpgcheck=0
            EOF
            yum install postgresql13 postgresql13-server -y
            /usr/pgsql-13/bin/postgresql-13-setup initdb
            systemctl enable --now postgresql-13
            systemctl status postgresql-13
            # Listen on all interfaces and allow remote clients
            sed -i "s/#listen_addresses = 'localhost'/listen_addresses = '*'/g" /var/lib/pgsql/13/data/postgresql.conf;
            sed -i "s/#port = 5432/port = 5432/g" /var/lib/pgsql/13/data/postgresql.conf
            sed -i "s=host all all ::1/128=host all all 0.0.0.0/0=g" /var/lib/pgsql/13/data/pg_hba.conf;
            # Create the test database and user as the postgres OS user
            su - postgres <<"EOPostgreSQL"
            psql -U postgres -c "create database testdb;"
            psql -U postgres -c "create user test with encrypted password 'test1234%';"
            psql -U postgres -c "grant all privileges on database testdb to test;"
            EOPostgreSQL
            # Restart so the configuration changes take effect
            systemctl restart postgresql-13
  DevPostgreSQLIssueTestEC2C:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      ImageId: ami-03b42693dc6a7dc35
      KeyName: !Ref KeyName
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestEC2C
      NetworkInterfaces:
        - DeviceIndex: 0
          SubnetId: !Ref DevPostgreSQLIssueTestPrivateSubnetC
          GroupSet:
            - !Ref DevPostgreSQLIssueTestEC2SG
          AssociatePublicIpAddress: false
      UserData:
        Fn::Base64:
          !Sub |
            #!/bin/bash
            hostname DevPostgreSQLIssueTestEC2C
            # Set the root password and allow password-based root login over SSH
            (
            echo "test1234%"
            echo "test1234%"
            ) | passwd --stdin root
            sed -i "s/^PasswordAuthentication no/PasswordAuthentication yes/g" /etc/ssh/sshd_config
            sed -i "s/^#PermitRootLogin yes/PermitRootLogin yes/g" /etc/ssh/sshd_config
            service sshd restart
            # Install PostgreSQL 13 from the PGDG repository
            amazon-linux-extras install epel -y
            tee /etc/yum.repos.d/pgdg.repo<<"EOF"
            [pgdg13]
            name=PostgreSQL 13 for RHEL/CentOS 7 - x86_64
            baseurl=http://download.postgresql.org/pub/repos/yum/13/redhat/rhel-7-x86_64
            enabled=1
            gpgcheck=0
            EOF
            yum install postgresql13 postgresql13-server -y
            /usr/pgsql-13/bin/postgresql-13-setup initdb
            systemctl enable --now postgresql-13
            systemctl status postgresql-13
            # Listen on all interfaces and allow remote clients
            sed -i "s/#listen_addresses = 'localhost'/listen_addresses = '*'/g" /var/lib/pgsql/13/data/postgresql.conf;
            sed -i "s/#port = 5432/port = 5432/g" /var/lib/pgsql/13/data/postgresql.conf
            sed -i "s=host all all ::1/128=host all all 0.0.0.0/0=g" /var/lib/pgsql/13/data/pg_hba.conf;
            # Create the test database and user as the postgres OS user
            su - postgres <<"EOPostgreSQL"
            psql -U postgres -c "create database testdb;"
            psql -U postgres -c "create user test with encrypted password 'test1234%';"
            psql -U postgres -c "grant all privileges on database testdb to test;"
            EOPostgreSQL
            # Restart so the configuration changes take effect
            systemctl restart postgresql-13
  # Bastion server for access from outside
  DevPostgreSQLIssueTestBastionC:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      ImageId: ami-03b42693dc6a7dc35
      KeyName: !Ref KeyName
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestBastionC
      NetworkInterfaces:
        - DeviceIndex: 0
          SubnetId: !Ref DevPostgreSQLIssueTestPublicSubnetC
          GroupSet:
            - !Ref DevPostgreSQLIssueTestEC2SG
          AssociatePublicIpAddress: true
      UserData:
        Fn::Base64:
          !Sub |
            #!/bin/bash
            hostname DevPostgreSQLIssueTestBastionC
  # EIP attached to the Bastion server
  DevPostgreSQLIssueTestBastionCEIP:
    Type: AWS::EC2::EIP
    Properties:
      InstanceId: !Ref DevPostgreSQLIssueTestBastionC
  # NLB
  DevPostgreSQLIssueTestNLB:
    Type: "AWS::ElasticLoadBalancingV2::LoadBalancer"
    Properties:
      Type: "network"
      Scheme: "internet-facing"
      IpAddressType: "ipv4"
      SubnetMappings:
        - SubnetId: !Ref DevPostgreSQLIssueTestPublicSubnetA
        - SubnetId: !Ref DevPostgreSQLIssueTestPublicSubnetC
      LoadBalancerAttributes:
        - Key: "deletion_protection.enabled"
          Value: false
        - Key: "access_logs.s3.enabled"
          Value: false
        - Key: "load_balancing.cross_zone.enabled"
          Value: true
      Tags:
        - Key: Name
          Value: DevPostgreSQLIssueTestNLB
  # Target group attached to the NLB
  DevPostgreSQLIssueTestNLBTargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      Port: 5432
      Protocol: TCP
      Targets:
        - Id: !Ref DevPostgreSQLIssueTestEC2A
        - Id: !Ref DevPostgreSQLIssueTestEC2C
      TargetType: instance
      VpcId: !Ref DevPostgreSQLIssueTestVPC
  # NLB listener and the target it forwards to
  # The listener accepts TCP on port 80 and forwards to the target group on port 5432
  DevPostgreSQLIssueTestNLBListener1:
    Type: "AWS::ElasticLoadBalancingV2::Listener"
    Properties:
      LoadBalancerArn: !Ref DevPostgreSQLIssueTestNLB
      Protocol: "TCP"
      Port: 80
      DefaultActions:
        - Type: "forward"
          ForwardConfig:
            TargetGroups:
              - TargetGroupArn: !Ref DevPostgreSQLIssueTestNLBTargetGroup
# Outputs
Outputs:
  DevPostgreSQLIssueTestEC2AIPAddress:
    Value: !GetAtt DevPostgreSQLIssueTestEC2A.PrivateIp
    Export:
      Name: "DevPostgreSQLIssueTestEC2AIP::Address"
  DevPostgreSQLIssueTestBastionCEIPAddress:
    Value: !Ref DevPostgreSQLIssueTestBastionCEIP
    Export:
      Name: "DevPostgreSQLIssueTestBastionCEIP::Address"
  DevPostgreSQLIssueTestNLBDNSName:
    Value: !GetAtt DevPostgreSQLIssueTestNLB.DNSName
    Export:
      Name: "DevPostgreSQLIssueTestNLBDNSName::DNSName"
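Once the setup was complete, connection tests showed no session drops or anything else unusual.
While checking various settings to try to reproduce the issue, I found the cross-zone load balancing setting on the NLB target group (disabled by default for NLBs), so I enabled it and tested again.
With that setting enabled, sessions opened by a single program connected separately to the DB in zone A and the DB in zone C.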
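One way to observe this behavior is to open several short-lived connections through the NLB and ask each one which server accepted it. The following is a minimal sketch, not the procedure used in the original test. `NLB_HOST`, `NLB_PORT`, `probe_backend`, and `tally_backends` are hypothetical names; the port 80 listener and the `test`/`testdb` credentials are taken from the template above; and it assumes `psql` is installed on the client.

```shell
#!/bin/sh
# Sketch: count which PostgreSQL instance answers each connection made through the NLB.

# Open one connection and print the address of the server that accepted it.
# inet_server_addr() is a built-in PostgreSQL function.
probe_backend() {
  PGPASSWORD="${PGPASSWORD:-test1234%}" psql \
    -h "$NLB_HOST" -p "${NLB_PORT:-80}" -U test -d testdb \
    -Atc "SELECT inet_server_addr();"
}

# Read one backend address per line and print "<address> <count>" per backend.
tally_backends() {
  awk 'NF { count[$0]++ } END { for (addr in count) print addr, count[addr] }' | sort
}

# Only touch the network when NLB_HOST is set (hypothetical variable).
if [ -n "${NLB_HOST:-}" ]; then
  for i in $(seq 1 20); do
    probe_backend
  done | tally_backends
fi
```

If the tally lists more than one address, connections from the same client are being spread across both instances.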
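Root cause details (speculation)
Because the exact issue could not be reproduced, the root cause is hard to pin down.
My guess is that cross-zone load balancing was enabled on that target group, so a parent session and its child sessions were load-balanced to different targets.
With DBeaver, the separate sessions are used for metadata and for running queries, so having them land on different DBs caused no connection problems. If the sessions have a parent-child relationship or something similar, however, a session drop seems possible.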
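AWS documents that an NLB picks a target for each TCP connection with a flow hash over the protocol, source IP and port, destination IP and port, and TCP sequence number, and that a connection stays on that target for its lifetime. Two connections from the same program differ at least in source port, so they can land on different targets. The sketch below only illustrates that idea: `cksum` is a stand-in because the real hash is not public, and `pick_target`, `TARGET_A`, `TARGET_C`, and all addresses are hypothetical.

```shell
#!/bin/sh
# Sketch: emulate per-connection target selection with a stand-in hash.
# cksum is NOT the hash an NLB uses; it only illustrates the idea.

# Hypothetical private IPs for the two PostgreSQL instances.
TARGET_A="10.10.10.5"
TARGET_C="10.10.11.7"

# pick_target SRC_IP SRC_PORT DST_IP DST_PORT
# Hash the connection tuple and map it onto one of the two targets.
pick_target() {
  tuple="tcp|$1|$2|$3|$4"
  hash="$(printf '%s' "$tuple" | cksum | cut -d' ' -f1)"
  if [ $(( hash % 2 )) -eq 0 ]; then
    echo "$TARGET_A"
  else
    echo "$TARGET_C"
  fi
}

# Same client and same NLB address, but different ephemeral source ports.
for port in 50001 50002 50003 50004; do
  echo "source port $port -> $(pick_target 203.0.113.10 "$port" 198.51.100.20 80)"
done
```

The same tuple always maps to the same target, which mirrors an established connection staying put. A new connection gets a new tuple and is assigned independently of the first.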
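Solution (speculation)
I suspect that enabling cross-zone load balancing on the target group was the cause, but the issue was never fully reproduced in testing,
and since the NLB relay has already been removed and the issue reportedly no longer occurs, the exact fix is hard to confirm.
(When the issue occurred, it did not happen over a direct connection that bypassed the NLB, so it seems the NLB was judged to be the problem and removed. It would have been better to test a few NLB settings first.)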
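As one example of such a settings test, the cross-zone attribute can be switched on the target group with the AWS CLI and the connection test repeated after each change. This is a sketch under assumptions, not a verified fix: `TARGET_GROUP_ARN`, `cross_zone_attr`, and `apply_attr` are hypothetical names, and it assumes the AWS CLI is configured with permission to modify the target group.

```shell
#!/bin/sh
# Sketch: toggle the cross-zone attribute on a target group between test runs.

# Build the attribute argument for cross-zone load balancing.
# Accepted values: true, false, use_load_balancer_configuration.
cross_zone_attr() {
  echo "Key=load_balancing.cross_zone.enabled,Value=$1"
}

# Apply one attribute to the target group named by TARGET_GROUP_ARN.
apply_attr() {
  aws elbv2 modify-target-group-attributes \
    --target-group-arn "$TARGET_GROUP_ARN" \
    --attributes "$1"
}

# Only call AWS when TARGET_GROUP_ARN is set (hypothetical variable).
if [ -n "${TARGET_GROUP_ARN:-}" ]; then
  apply_attr "$(cross_zone_attr false)"
fi
```

With cross-zone disabled, each NLB node forwards only to targets in its own Availability Zone. A client can still resolve the NLB name to either node, so this alone may not pin every connection to one DB.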
References
- AWS EC2 userdata script: https://stackoverflow.com/questions/71418987/aws-ec2-userdata-script-how-create-postgres-user-password-and-database
- Setting up PostgreSQL on an Amazon Linux 2 EC2 instance: https://nulls.co.kr/bones-skins/454
- Reference for creating an NLB with CloudFormation: https://asecure.cloud/a/NetworkLoadBalancer/
- Reference for creating an NLB with CloudFormation (2): https://stackoverflow.com/questions/71879921/aws-network-loadbalancer-listener-and-the-following-target-groups-have-incompati
- Reference for creating RDS with CloudFormation: https://velog.io/@yuran3391/eks
- PostgreSQL query reference: https://blusky10.tistory.com/360