AWS Lambda: good to know in advance
When I first thought of Lambda, the AWS serverless service, it seemed like a magical box. Once I created a Lambda function, I imagined that AWS would do everything I needed except for the business logic. However, I found that there are several things to consider before you decide to go with Lambda.
1. Cold start
Before reading the official documentation, the word "serverless" meant to me that containers were already provisioned and waiting for me to deploy my program. Honestly, I expected AWS to provide me with warmed-up computing power.
However, it turned out that AWS doesn't keep containers running for you. So the first invocation takes a long time, and if you attach a VPC to the Lambda function, the elapsed time becomes worse. In my experience, the average elapsed time of the first invocation was about 18~20 seconds. All my function did was reside in a VPC and query an empty RDS database table. After that, each invocation took 300~500 ms. (Please note that I invoked the Lambda function in the AWS region eu-west-1 from South Korea.)
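To see which invocations actually hit a fresh container, one option is to rely on the fact that module-level code runs once per container. The following is only a minimal sketch: the handler name and the returned fields are my own choices for illustration, not anything prescribed by AWS.

```python
import time

# Module-level code runs once per container, i.e. during the cold start.
_CONTAINER_STARTED_AT = time.monotonic()
_invocation_count = 0


def handler(event, context):
    """Report whether this invocation landed on a fresh container."""
    global _invocation_count
    _invocation_count += 1
    return {
        # True only for the first invocation served by this container.
        "cold_start": _invocation_count == 1,
        "invocation": _invocation_count,
        "container_age_s": round(time.monotonic() - _CONTAINER_STARTED_AT, 3),
    }
```

Logging these fields next to the client-side elapsed time should make it easier to separate cold start delay from network latency between regions.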
2. Concurrency
If I invoke the same Lambda function 10 times simultaneously when no container is warm, 10 containers cold start at once. Each invocation takes 18~20 seconds, and that is unacceptable. If I used EC2 instead, all 10 requests would finish in 500 ms at most.
There is a workaround: keeping Lambda containers warmed up. But it means paying for meaningless periodic invocations. The cost might be negligible, but I have to ask myself what the point of using Lambda is.
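A warm-up ping could look roughly like the sketch below. It assumes a scheduled rule that invokes the function with a constant payload such as {"warmer": true}; that key, and the do_business_logic helper, are hypothetical names I made up for the example.

```python
def do_business_logic(event):
    # Placeholder for the real work of the function (hypothetical).
    return {"status": "ok", "echo": event.get("payload")}


def handler(event, context):
    # Assumed convention: the scheduled rule sends {"warmer": true}.
    # Return early so the ping stays as cheap as possible.
    if isinstance(event, dict) and event.get("warmer"):
        return {"status": "warmed"}
    return do_business_logic(event)
```

Note that a single ping keeps only one container warm. To cover 10 simultaneous requests, 10 overlapping pings would be needed, which makes the workaround even less attractive.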
3. Resource Pooling
Some Lambda functions need to connect to an RDS database and others might need an HTTP connection. Since each Lambda invocation behaves like a single thread that is called once, it seems to me that resource pooling is useless.
I know we could keep a resource pool by creating it outside of the handler. And I might get some benefit from the pool if I am lucky. Nonetheless, we can't choose which container handles a request. We can't catch the event of a container being destroyed, nor can we tell whether the maximum limit has been reached. And how can I be sure that the connection I get from the pool is not stale? I think creating and releasing a resource inside the handler is the best way so far.
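For comparison, here is a rough sketch of the outside-the-handler variant with a liveness check before each use. It is not a recommendation, just an illustration of the trade-off. To keep it runnable, sqlite3 stands in for a real RDS driver, and the helper names (_connect, _is_alive, get_connection) are hypothetical.

```python
import sqlite3

# Lives outside the handler, so it survives only if the container is reused.
_conn = None


def _connect():
    # Stand-in for a real RDS driver call (assumption): sqlite3 keeps this runnable.
    return sqlite3.connect(":memory:")


def _is_alive(conn):
    # Cheap probe to detect a stale or closed connection.
    try:
        conn.execute("SELECT 1")
        return True
    except sqlite3.Error:
        return False


def get_connection():
    global _conn
    if _conn is None or not _is_alive(_conn):
        _conn = _connect()
    return _conn


def handler(event, context):
    conn = get_connection()
    row = conn.execute("SELECT 1").fetchone()
    return {"result": row[0]}
```

The probe costs one extra round trip per invocation, and it still can't tell me when a container is destroyed or how many connections other containers are holding, so the concerns above remain.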
I hope AWS will provide a truly serverless service in the future. Until then, I will keep struggling to find the best practices for using Lambda functions.